
House price prediction 1/4: Using Keras/Tensorflow and python

A series about creating a model using Python and Tensorflow and then importing the model and making predictions using Javascript in a Vue.js application. Above is the vid and below you will find some useful notes.
  • In this post I am going to talk about how to create a model in python, pre-process a dataset I've already created, train a model, post-process, predict, and finally create different files for sharing some information about the data for later use.
    Then in part 2 called House price prediction 2/4: Using Tensorflow.js, Vue.js and Javascript I will take the model, the data for pre and post processing and finally predict using Vue.js.
    Then in part 3 I will show how one hot encoding works.
    And finally, in part 4, I will cover normalizing the inputs and why it matters.
    If you want to see a simpler model and how it integrates with a javascript application using Tensorflow.js and Vue.js you can check my previous post: How to import a Keras model into a Vue.js application using Tensorflow.Js, where I also show how to publish the web site into Github Pages.
  1. Pre-reqs

    • Have Python 3.x installed
    • Have Tensorflow installed
    • Have Anaconda installed (optional)
  2. The Dataset

    • The data was organized in a csv file with the price as the output and size, rooms, baths, parking and neighborhood as the inputs
      price,size,rooms,baths,parking,neighborhood
      270000000,180.0,5,2.0,0,medellin aranjuez
      280000000,168.0,5,3.0,0,medellin centro
      350000000,95.0,3,2.0,0,medellin belen rosales
      350000000,103.0,3,3.0,1,medellin la castellana
      310000000,95.0,3,2.0,0,medellin la castellana
      ...
      
    • I defined a couple of variables to hold the column names for the inputs, outputs and also for the categorical column
        
      X_colum_names = ['size', 'rooms', 'baths', 'parking', 'neighborhood']
      Y_colum_names = ['price']
      categorical_column = 'neighborhood'
        
      
    • Loaded the Dataset using Pandas
        
      import pandas as pd

      CSV_PATH = "./dataset/dataset.csv"
      
      df = pd.read_csv(CSV_PATH, index_col=False)
      
      print(df.head())
      print(df.columns)
      
        
      
    • Split the features/columns between inputs (X) and outputs (Y)
        
      X = df[common.X_colum_names]
      Y = df[common.Y_colum_names]
      
      print(X.head(), Y.head())
        
      
    • Created the one hot encoder and the scalers, and split the dataset into train and test groups (a sketch of what these helper functions might look like is shown after this list)
        
      #%% Configure categorical columns
      label_encoder, onehot_encoder = common_categorical.create_categorical_feature_encoder(X[common.categorical_column])
      
      #%% Scale data
      # Create scaler so that the data is in the same range
      x_scaler, y_scaler = common_scaler.create_scaler(X.values[:,0:4], Y.values)
      
      #%% Split the dataset into different groups
      from sklearn.model_selection import train_test_split
      X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)
        
      
    • Then the pre-processing utilities are used to transform the training and validation inputs and outputs into the format the model expects
        
      arr_x_train, arr_y_train = common_pre_post_processing.transform_inputs(X_train,
                                                         label_encoder,
                                                         onehot_encoder,
                                                         x_scaler,
                                                         y_train,
                                                         y_scaler)
      
      arr_x_valid, arr_y_valid = common_pre_post_processing.transform_inputs(X_test,
                                                         label_encoder,
                                                         onehot_encoder,
                                                         x_scaler,
                                                         y_test,
                                                         y_scaler)
        
      
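    The helper modules used above (common_categorical, common_scaler and common_pre_post_processing) belong to the project and are not shown in the post. A minimal sketch of what they might look like, assuming scikit-learn's LabelEncoder, OneHotEncoder and StandardScaler (the mean/variance values exported in step 5 suggest a StandardScaler-style scaler):

      import numpy as np
      from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

      def create_categorical_feature_encoder(values):
          # Integer-encode each neighborhood name, then one-hot encode the integers
          label_encoder = LabelEncoder()
          int_labels = label_encoder.fit_transform(values)
          # Note: newer scikit-learn versions use sparse_output=False instead of sparse=False
          onehot_encoder = OneHotEncoder(sparse=False)
          onehot_encoder.fit(int_labels.reshape(-1, 1))
          return label_encoder, onehot_encoder

      def create_scaler(x_numeric, y):
          # Standardize the numeric inputs and the price so they share a similar range
          return StandardScaler().fit(x_numeric), StandardScaler().fit(y)

      def transform_inputs(x, label_encoder, onehot_encoder, x_scaler, y=None, y_scaler=None):
          # Scale the 4 numeric columns, one-hot encode the neighborhood column and
          # concatenate both into the array the model expects
          x = np.asarray(x, dtype=object)
          numeric = x_scaler.transform(x[:, 0:4].astype(float))
          neighborhoods = onehot_encoder.transform(
              label_encoder.transform(x[:, 4]).reshape(-1, 1))
          arr_x = np.concatenate([numeric, neighborhoods], axis=1)
          arr_y = y_scaler.transform(np.asarray(y, dtype=float)) if y is not None else None
          return arr_x, arr_y
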
  3. Training the model

    • First I define the model
        
      #%% Create the model
      from keras.models import Sequential
      from keras.layers import Dense, Activation, Dropout
      from keras.optimizers import Adam
      from keras import metrics

      def build_model(x_size, y_size):
          model = Sequential()
          model.add(Dense(100, input_shape=(x_size,)))
          model.add(Activation('relu'))
          model.add(Dropout(0.5))
      
          model.add(Dense(y_size))
      
          model.compile(loss='mean_squared_error',
              optimizer=Adam(),
              metrics=[metrics.mae])
      
          return(model)
      
      print(arr_x_train.shape[1], arr_y_train.shape[1])
        
      
    • Then I built the model
        
      model = build_model(arr_x_train.shape[1], arr_y_train.shape[1])
      model.summary()
        
      
    • We fit the model (example values for batch_size, epochs and keras_callbacks are sketched after this list)
        
      history = model.fit(arr_x_train, arr_y_train,
          batch_size=batch_size,
          epochs=epochs,
          shuffle=True,
          verbose=2,
          validation_data=(arr_x_valid, arr_y_valid),
          callbacks=keras_callbacks)
        
      
    • And finally we save the model
        
      model.save(common.model_file_name)
        
      
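    The fit call above references batch_size, epochs and keras_callbacks, which are defined elsewhere in the project. The values below are only an illustrative assumption, not the author's exact configuration:

      from keras.callbacks import EarlyStopping, ModelCheckpoint

      epochs = 500        # assumed value, tune for your dataset
      batch_size = 32     # assumed value
      keras_callbacks = [
          # Stop training when the validation loss stops improving
          EarlyStopping(monitor='val_loss', patience=20, verbose=0),
          # Keep the best weights seen so far (the checkpoint path is hypothetical)
          ModelCheckpoint('./model/checkpoint.h5', monitor='val_loss',
                          save_best_only=True, verbose=0),
      ]
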
  4. Predicting using the model

    • I load the model and also the one hot encoder and scalers (sketches of the load helpers and of transform_outputs are shown after this list)
        
      # Load the previous state of the model
      from keras.models import load_model
      model = load_model(common.model_file_name)
      
      # Load the previous state of the encoders and scalers
      label_encoder, onehot_encoder = common_categorical.load_categorical_feature_encoder()
      x_scaler, y_scaler = common_scaler.load_scaler()
        
      
    • I define some simple values for testing purposes
        
      #Some inputs to predict
      # 'size', 'rooms', 'baths', 'parking', 'neighborhood'
      values = [
          [180, 5, 2, 0, 'envigado'],
          [180, 5, 2, 0, 'medellin belen'],
          [180, 5, 2, 0, 'sabaneta zaratoga'],
          #310000000,97,3,2,2,sabaneta centro
          [ 97, 3, 2, 2, 'sabaneta centro'],
          #258000000,105,3,2,0,medellin belen
          [105, 3, 2, 0, 'medellin belen'],
          #335000000,160,3,3,2,medellin la mota
          [160, 3, 3, 2, 'medellin la mota'],
      ]
        
      
    • Pre process the inputs
        
      # Transform inputs to the format that the model expects
      model_inputs, _ = common_pre_post_processing.transform_inputs(values, label_encoder, onehot_encoder, x_scaler)
        
      
    • Predict using the model
        
      # Use the model to predict the price for a house
      y_predicted = model.predict(model_inputs)
        
      
    • Post process the output
        
      # Transform the results into a user friendly representation
      y_predicted_unscaled = common_pre_post_processing.transform_outputs(y_predicted, y_scaler)
        
      
    • and finally print the results
        
      print('Results when:')
      print('Scale Input Features = ', common.scale_features_input)
      print('Scale Output Features = ', common.scale_features_output)
      print('Use Categorical Feature Encoder = ', common.use_categorical_feature_encoder)
      
      for i in range(0, len(values)):
          print(values[i][4], y_predicted[i][0], int(y_predicted_unscaled[i]))
        
      
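    The load_categorical_feature_encoder, load_scaler and transform_outputs helpers also come from the project's modules. A minimal sketch, assuming the encoders and scalers were persisted with joblib during training and that the output scaler is a StandardScaler (the file paths are hypothetical):

      import joblib

      def load_categorical_feature_encoder():
          # Assumes the encoders were saved with joblib.dump during training
          return (joblib.load('./model/label_encoder.pkl'),
                  joblib.load('./model/onehot_encoder.pkl'))

      def load_scaler():
          return (joblib.load('./model/x_scaler.pkl'),
                  joblib.load('./model/y_scaler.pkl'))

      def transform_outputs(y_predicted, y_scaler):
          # Undo the output scaling so each prediction is back in the original price units
          return y_scaler.inverse_transform(y_predicted).flatten()
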
  5. Sharing data

    Given that we need to pass some data to the future Tensorflow.js application, we need to make sure the important values are available in a format that is simple to access
    • Export the categories (a sketch of the generate_json_file helper is shown near the end of this section)
        
      # Load the previous state of the encoders
      label_encoder, onehot_encoder = common_categorical.load_categorical_feature_encoder()
      
      enconder_classes = list(label_encoder.classes_)
      
      common_file.generate_json_file(enconder_classes, common.root_share_folder, 'neighborhoods')
      
      print(enconder_classes)
        
      
        ["envigado", "envigado abadia", "envigado aburra sur", "envigado alcal", "envigado alcala", "envigado alquerias de san isidro", "envigado alto de las flores", "envigado alto de misael", "envigado altos de misael", "envigado andalucia", "envigado antillas", "envigado av poblado", "envigado b margaritas", "envigado barrio mesa", "envigado barrio obrero"
      
    • Export the scaler information (mean and variance) for the inputs
        
      x_scaler, y_scaler = common_scaler.load_scaler()
      
      mean_x = x_scaler.mean_
      var_x = x_scaler.var_
      
      common_file.generate_json_file(list(mean_x), common.root_share_folder, 'scaler-mean-x')
      common_file.generate_json_file(list(var_x), common.root_share_folder, 'scaler-var-x')
        
      
        scaler-mean-x
        [114.6902816399287, 3.4014260249554367, 2.3436720142602496, 0.5928698752228164]
      
        scaler-var-x
        [572.865809011588, 0.37646855468812057, 0.2255615608745523, 0.4053680561513214]
      
    • and also export the scaler information (mean and variance) for the outputs
        
      mean_y = y_scaler.mean_
      var_y = y_scaler.var_
      
      common_file.generate_json_file(list(mean_y), common.root_share_folder, 'scaler-mean-y')
      common_file.generate_json_file(list(var_y), common.root_share_folder, 'scaler-var-y')
        
      
        scaler-mean-y
        [281340671.3761141]
      
        scaler-var-y
        [3091863947148531.5]
      
    • Use the tensorflowjs_converter to convert the model so that it can be imported into Tensorflow.js
        
      tensorflowjs_converter --input_format keras ./model/-inputsscaled-outputsscaled-categorical/model.h5 ./shared/model
        
      
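    generate_json_file is another small project helper; it most likely just writes a Python list to a JSON file inside the shared folder. A minimal sketch, assuming that behaviour:

      import json
      import os

      def generate_json_file(data, folder, name):
          # Write the list as <folder>/<name>.json so the Tensorflow.js app can fetch it
          os.makedirs(folder, exist_ok=True)
          with open(os.path.join(folder, name + '.json'), 'w') as f:
              # default=float covers numpy scalar types that json can't serialize directly
              json.dump(data, f, default=float)
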
    • And that's it! phew! Let me know if you have any questions!
