batch normalization

Paper Review

Batch Normalization : Accelerating Deep Network Training byReducing Internal Covariate Shift

[๋…ผ๋ฌธ๋ฆฌ๋ทฐ] ABSTRACT DNN์˜ ํ›ˆ๋ จ์€ ์ด์ „ layer์˜ ๋งค๊ฐœ ๋ณ€์ˆ˜๊ฐ€ ๋ณ€๊ฒฝ๋จ์— ๋”ฐ๋ผ ํ›ˆ๋ จ ์ค‘์— ๊ฐ layer์˜ ์ธํ’‹ ๋ถ„ํฌ๊ฐ€ ๋ณ€๊ฒฝ๋œ๋‹ค ์ด๋Š” ๋‚ฎ์€ learning rate์™€ ์‹ ์ค‘ํ•œ ํŒŒ๋ผ๋ฏธํ„ฐ ์ดˆ๊ธฐํ™”๋ฅผ ์š”๊ตฌํ•˜๊ธฐ ๋•Œ๋ฌธ์— ํ›ˆ๋ จ ์†๋„๋ฅผ ๋Šฆ์ถ”๊ณ  ๋น„์„ ํ˜•์„ฑ์„ ๊ฐ€์ง„ ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ค๋Š” ๊ฒƒ์ด ์–ด๋ ต๋‹ค ์ด ๋ฌธ์ œ๋ฅผ layer์˜ ์ž…๋ ฅ์˜ normalization์„ ํ†ตํ•ด ํ•ด๊ฒฐํ•œ๋‹ค. ๊ฐ mini batch์— ๋Œ€ํ•œ normalization์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ฒƒ์—์„œ ๋งŽ์€ ์žฅ์ ์„ ๊ฐ€์ง„๋‹ค. ๋ฐฐ์น˜ normalization์„ ํ†ตํ•ด ํ›จ์”ฌ ๋” ๋†’์€ ํ•™์Šต ์†๋„๋กœ ์‚ฌ์šฉํ•˜๊ณ  ์ดˆ๊ธฐํ™”์— ๋œ ๋ฏผ๊ฐํ•˜๋‹ค. Introduction ํ™•๋ฅ ์  ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•(SGD)๋Š” ์‹ฌ์ธต ๋„คํŠธ์›Œํฌ๋ฅผ ํ›ˆ๋ จํ•˜๋Š” ํšจ๊ณผ์ ์ธ ๋ฐฉ๋ฒ•์œผ๋กœ ์ž…์ฆ๋˜์—ˆ๋‹ค. SGD๋Š” ํ•™์Šต์„ ๊ฐ ๋‹จ๊ณ„๋ณ„๋กœ ์ง„ํ–‰ํ•˜๋ฉฐ, ๋ฏธ๋‹ˆ ๋ฐฐ์น˜ ํฌ๊ธฐ๋ฅผ..

velpegor
'batch normalization' ํƒœ๊ทธ์˜ ๊ธ€ ๋ชฉ๋ก