Oct 21, 2018
Thank you, Piotr, for this great article! It is very nicely explained and coded.
I like how you created the nn_architecture using a dictionary; it reminds me of TensorFlow (each output connects to the next input, so the dimensions have to match).
I do have one question regarding the bias: why is its shape always (layer_output_size, 1)?
def init_layers(self, seed=None):
    # Initialize weights and biases
    # weights have shape (n_prev, n_hidden)
    # set the random seed
    np.random.seed(seed)
    # get the number of layers
    number_of_layers = len(self.nn_architecture)
    params_values = {}

    for idx, layer in enumerate(self.nn_architecture):
        layer_idx = idx + 1
        layer_input_size = layer["input_dim"]
        layer_output_size = layer["output_dim"]

        params_values['W' + str(layer_idx)] = np.random.randn(
            layer_input_size, layer_output_size) * 0.1
        params_values['b' + str(layer_idx)] = np.random.randn(
            self.batch, layer_output_size) * 0.1

    self.params_values = params_values
    return params_values
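(For context, here is roughly how I exercise it: a standalone sketch with a made-up two-layer nn_architecture and an assumed batch of 8, just to show the parameter shapes my version produces. The names and sizes are illustrative, not from your article.)

import numpy as np

# hypothetical two-layer architecture in the same dictionary style
nn_architecture = [
    {"input_dim": 2, "output_dim": 4},
    {"input_dim": 4, "output_dim": 1},
]
batch = 8  # assumed batch size (stands in for self.batch)

np.random.seed(0)
params_values = {}
for idx, layer in enumerate(nn_architecture):
    layer_idx = idx + 1
    # weights: (input_dim, output_dim); biases: (batch, output_dim) in my version
    params_values['W' + str(layer_idx)] = np.random.randn(
        layer["input_dim"], layer["output_dim"]) * 0.1
    params_values['b' + str(layer_idx)] = np.random.randn(
        batch, layer["output_dim"]) * 0.1

for name, value in params_values.items():
    print(name, value.shape)
# prints: W1 (2, 4), b1 (8, 4), W2 (4, 1), b2 (8, 1)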
Does this make sense?
Batch size is the number of examples we feed at once, so shouldn't the shape of the bias reflect the batch size, i.e. a bias of shape (batch, layer_output_size)?
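To illustrate what I mean, here is a minimal numpy sketch (mine, not from the article) assuming the forward step is written row-wise as Z = A·W + b with A of shape (batch, input_dim); if I read your code correctly, your column-wise bias of shape (layer_output_size, 1) plays the role of the (1, output_dim) case below, just transposed.

import numpy as np

batch, input_dim, output_dim = 8, 2, 4
A = np.random.randn(batch, input_dim)                 # one batch of examples
W = np.random.randn(input_dim, output_dim) * 0.1

# one bias value per output neuron, broadcast over the batch dimension
b_shared = np.random.randn(1, output_dim) * 0.1       # shape (1, output_dim)
Z_shared = A @ W + b_shared                           # broadcasts to (batch, output_dim)

# what I was asking about: a separate bias row per example in the batch
b_per_example = np.random.randn(batch, output_dim) * 0.1
Z_per_example = A @ W + b_per_example                 # shapes already match, no broadcasting

print(Z_shared.shape, Z_per_example.shape)            # (8, 4) (8, 4)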