Xavier Initialization

Paper Review

Understanding the difficulty of training deep feedforward neural networks

[Paper Review] ABSTRACT
Why does the standard gradient-descent algorithm with random initialization perform poorly on deep neural networks? The logistic sigmoid activation, combined with random initialization, is ill-suited to deep networks because of its mean value: it drives the top layers into saturation. This paper introduces a new initialization scheme that brings substantially faster convergence (see the sketch below).

Deep Neural Networks
Deep learning aims to learn a feature hierarchy using extracted features. Extracted features: from the higher-level layers built through the composition of lower-level features, ex…
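For reference, the initialization scheme the paper proposes ("normalized initialization," now commonly called Xavier or Glorot initialization) draws each weight uniformly from [-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))], so that activation and gradient variances stay roughly constant across layers. Below is a minimal NumPy sketch; the function name xavier_init and the layer sizes are illustrative choices, not from the post.

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    """Normalized (Xavier/Glorot) initialization, Glorot & Bengio (2010).

    Samples W ~ U[-sqrt(6/(fan_in+fan_out)), +sqrt(6/(fan_in+fan_out))].
    """
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Illustrative usage: weights for a 784 -> 256 fully connected layer.
W = xavier_init(784, 256)
print(W.std())  # roughly sqrt(2/(784+256)) ~= 0.044
```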

velpegor
'Xavier Initialization' ํƒœ๊ทธ์˜ ๊ธ€ ๋ชฉ๋ก