A series about creating a model using Python and Tensorflow, then importing the model and making predictions
using Javascript in a Vue.js application. The video is above, and below you will find some useful notes.
-
In this post I am going to talk about how to create a model in Python, pre-process a dataset I've already created, train a model, post-process, predict, and finally create different files for sharing some information about the data for later use.
Then in part 2, called House price prediction 2/4: Using Tensorflow.js, Vue.js and Javascript, I will take the model and the data for pre- and post-processing, and finally predict using Vue.js.
Then in part 3 I will show how one-hot encoding works.
And finally, in part 4, I will cover normalizing the inputs and why it matters.
If you want to see a simpler model and how it integrates with a Javascript application using Tensorflow.js and Vue.js, you can check my previous post: How to import a Keras model into a Vue.js application using Tensorflow.Js, where I also show how to publish the web site to Github Pages.
- 1.
Pre-reqs
-
Have Python 3.x installed
Have Tensorflow installed
Have Anaconda installed (optional)
-
- 2.
The Dataset
-
The data was organized in a csv file with the price as the output and size, rooms, baths, parking and neighborhood as the inputs
price,size,rooms,baths,parking,neighborhood
270000000,180.0,5,2.0,0,medellin aranjuez
280000000,168.0,5,3.0,0,medellin centro
350000000,95.0,3,2.0,0,medellin belen rosales
350000000,103.0,3,3.0,1,medellin la castellana
310000000,95.0,3,2.0,0,medellin la castellana
...
-
I defined a couple of variables to hold the column names for the inputs, outputs and also for the categorical column
X_colum_names = ['size', 'rooms', 'baths', 'parking', 'neighborhood']
Y_colum_names = ['price']
categorical_column = 'neighborhood'
-
Loaded the Dataset using Pandas
import pandas as pd

CSV_PATH = "./dataset/dataset.csv"
df = pd.read_csv(CSV_PATH, index_col=False)
print(df.head())
print(df.columns)
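To make the file layout concrete, here is a minimal sketch that parses a few of the rows shown earlier. It uses Python's built-in csv module instead of Pandas so that it runs without extra dependencies. The inline sample and the names SAMPLE_CSV and read_rows are only for illustration and are not part of the project code.

```python
import csv
import io

# A few rows copied from the dataset sample (inline here for illustration only)
SAMPLE_CSV = """price,size,rooms,baths,parking,neighborhood
270000000,180.0,5,2.0,0,medellin aranjuez
280000000,168.0,5,3.0,0,medellin centro
350000000,95.0,3,2.0,0,medellin belen rosales
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, converting the numeric columns."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "price": int(raw["price"]),
            "size": float(raw["size"]),
            "rooms": int(raw["rooms"]),
            "baths": float(raw["baths"]),
            "parking": int(raw["parking"]),
            "neighborhood": raw["neighborhood"],
        })
    return rows


rows = read_rows(SAMPLE_CSV)
print(rows[0])
```

The price is the value to predict, the four numeric columns are the scaled inputs, and the neighborhood is the only text column, which is why it gets special treatment later.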
-
Split the features/columns between inputs (X) and outputs (Y)
X = df[common.X_colum_names]
Y = df[common.Y_colum_names]
print(X.head(), Y.head())
-
Created the one hot encoder, scalers and split the dataset into groups
from sklearn.model_selection import train_test_split

#%% Configure categorical columns
label_encoder, onehot_encoder = common_categorical.create_categorical_feature_encoder(X[common.categorical_column])

#%% Scale data
# Create scaler so that the data is in the same range
x_scaler, y_scaler = common_scaler.create_scaler(X.values[:,0:4], Y.values)

#%% Split the dataset into different groups
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)
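The helper modules are not shown in this post, so here is a minimal plain-Python sketch of the idea behind one-hot encoding the neighborhood column. It is not the code in common_categorical. The function names fit_categories and one_hot are made up for illustration, and the sorted category list only mirrors how scikit-learn's LabelEncoder orders its classes_.

```python
def fit_categories(values):
    """Return the sorted unique categories (the same ordering as LabelEncoder.classes_)."""
    return sorted(set(values))


def one_hot(value, categories):
    """Return a list with 1.0 at the category's position and 0.0 everywhere else."""
    index = categories.index(value)  # raises ValueError for an unseen category
    return [1.0 if i == index else 0.0 for i in range(len(categories))]


# Hypothetical sample of the neighborhood column
neighborhoods = ["medellin centro", "envigado", "medellin belen", "envigado"]

categories = fit_categories(neighborhoods)
encoded = one_hot("medellin belen", categories)

print(categories)
print(encoded)
```

Each neighborhood becomes its own input column, so the number of model inputs grows with the number of distinct neighborhoods in the dataset.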
-
Then the utilities for pre and post processing the inputs are used
arr_x_train, arr_y_train = common_pre_post_processing.transform_inputs(X_train, label_encoder, onehot_encoder, x_scaler, y_train, y_scaler)
arr_x_valid, arr_y_valid = common_pre_post_processing.transform_inputs(X_test, label_encoder, onehot_encoder, x_scaler, y_test, y_scaler)
-
- 3.
Training the model
-
First I create the model
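# Imports assumed here (tensorflow.keras); the original snippet does not show them
from tensorflow.keras import metrics
from tensorflow.keras.layers import Activation, Dense, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

#%% Create the model
def build_model(x_size, y_size):
    # One hidden layer of 100 units with ReLU and dropout, then a linear output
    model = Sequential()
    model.add(Dense(100, input_shape=(x_size,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(y_size))
    model.compile(loss='mean_squared_error',
                  optimizer=Adam(),
                  metrics=[metrics.mae])
    return(model)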
-
Then I built the model
print(arr_x_train.shape[1], arr_y_train.shape[1])
model = build_model(arr_x_train.shape[1], arr_y_train.shape[1])
model.summary()
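model.summary() reports the number of trainable parameters per layer. As a quick sanity check, those numbers can be computed by hand: a Dense layer has one weight per input-unit pair plus one bias per unit, while the Activation and Dropout layers add none. The sketch below assumes a hypothetical input size of 14 (4 numeric features plus 10 one-hot columns); the real size depends on how many neighborhoods the dataset has.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit for a fully connected layer."""
    return inputs * units + units


def count_params(x_size, y_size, hidden_units=100):
    """Total parameters for Dense(hidden_units) followed by Dense(y_size)."""
    return dense_params(x_size, hidden_units) + dense_params(hidden_units, y_size)


# Hypothetical sizes: 14 inputs, 1 output (the price)
total = count_params(14, 1)
print(total)
```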
-
We fit the model
history = model.fit(arr_x_train, arr_y_train,
    batch_size=batch_size,
    epochs=epochs,
    shuffle=True,
    verbose=2,
    validation_data=(arr_x_valid, arr_y_valid),
    callbacks=keras_callbacks)
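During training Keras reports the loss (mean squared error) and the metric (mean absolute error) for both the training and validation sets. Here is a minimal plain-Python sketch of what those two numbers measure. Keras computes them on tensors, and the sample values below are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences; large errors are penalized more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences, in the same units as the targets."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up scaled targets and predictions
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(mse, mae)
```

Because the outputs are scaled before training, both values are in scaled units rather than in the currency of the price column.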
-
And finally we save the model
model.save(common.model_file_name)
-
- 4.
Predicting using the model
-
I load the model and also the Onehot Encoder and Scalers
# Load the previous state of the model
model = load_model(common.model_file_name)

# Load the previous state of the encoders and scalers
label_encoder, onehot_encoder = common_categorical.load_categorical_feature_encoder()
x_scaler, y_scaler = common_scaler.load_scaler()
-
I define some simple values for testing purposes
#Some inputs to predict
# 'size', 'rooms', 'baths', 'parking', 'neighborhood'
values = [
    [180, 5, 2, 0, 'envigado'],
    [180, 5, 2, 0, 'medellin belen'],
    [180, 5, 2, 0, 'sabaneta zaratoga'],
    #310000000,97,3,2,2,sabaneta centro
    [ 97, 3, 2, 2, 'sabaneta centro'],
    #258000000,105,3,2,0,medellin belen
    [105, 3, 2, 0, 'medellin belen'],
    #335000000,160,3,3,2,medellin la mota
    [160, 3, 3, 2, 'medellin la mota'],
]
-
Pre process the inputs
# Transform inputs to the format that the model expects
model_inputs, _ = common_pre_post_processing.transform_inputs(values, label_encoder, onehot_encoder, x_scaler)
-
Predict using the model
# Use the model to predict the price for a house
y_predicted = model.predict(model_inputs)
-
Post process the output
# Transform the results into a user friendly representation
y_predicted_unscaled = common_pre_post_processing.transform_outputs(y_predicted, y_scaler)
-
And finally print the results
print('Results when:')
print('Scale Input Features = ', common.scale_features_input)
print('Scale Output Features = ', common.scale_features_output)
print('Use Categorical Feature Encoder = ', common.use_categorical_feature_encoder)

for i in range(0, len(values)):
    print(values[i][4], y_predicted[i][0], int(y_predicted_unscaled[i]))
-
- 5.
Sharing data
-
Given that we need to pass some data to the future Tensorflow.js application, we need to make sure the important values are available in a format that is simple to access
-
Export the categories
# Load the previous state of the encoders
label_encoder, onehot_encoder = common_categorical.load_categorical_feature_encoder()

# The encoder classes are the neighborhood names, in encoding order
enconder_classes = list(label_encoder.classes_)
common_file.generate_json_file(enconder_classes, common.root_share_folder, 'neighborhoods')
print(enconder_classes)
["envigado", "envigado abadia", "envigado aburra sur", "envigado alcal", "envigado alcala", "envigado alquerias de san isidro", "envigado alto de las flores", "envigado alto de misael", "envigado altos de misael", "envigado andalucia", "envigado antillas", "envigado av poblado", "envigado b margaritas", "envigado barrio mesa", "envigado barrio obrero"
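The common_file helper is not shown in this post. As a rough sketch, assuming generate_json_file simply writes the data to a .json file named after its last argument, it could look like the code below. The body and the returned path are assumptions for illustration, not the actual implementation.

```python
import json
import os
import tempfile


def generate_json_file(data, folder, name):
    """Write `data` as JSON to <folder>/<name>.json and return the file path.

    Hypothetical stand-in for common_file.generate_json_file.
    """
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, name + ".json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return path


# A temporary folder stands in for common.root_share_folder
share_folder = tempfile.mkdtemp()
path = generate_json_file(["envigado", "envigado abadia"], share_folder, "neighborhoods")

# Read the file back to confirm the round trip
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded)
```

A plain JSON array keeps the order of the categories, which matters because the position of each neighborhood is what the one-hot encoding is built from.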
-
Export the information for the scaler for the inputs
x_scaler, y_scaler = common_scaler.load_scaler()

mean_x = x_scaler.mean_
var_x = x_scaler.var_
common_file.generate_json_file(list(mean_x), common.root_share_folder, 'scaler-mean-x')
common_file.generate_json_file(list(var_x), common.root_share_folder, 'scaler-var-x')
scaler-mean-x
[114.6902816399287, 3.4014260249554367, 2.3436720142602496, 0.5928698752228164]
scaler-var-x
[572.865809011588, 0.37646855468812057, 0.2255615608745523, 0.4053680561513214]
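These two lists are all that is needed to reproduce the input scaling outside of Python. Here is a minimal sketch, assuming the scaler is a scikit-learn StandardScaler (the mean_ and var_ attributes suggest so), which applies z = (x - mean) / sqrt(var) to each feature. The function name standardize is made up for illustration.

```python
import math

# Values copied from the exported scaler-mean-x and scaler-var-x files
MEAN_X = [114.6902816399287, 3.4014260249554367, 2.3436720142602496, 0.5928698752228164]
VAR_X = [572.865809011588, 0.37646855468812057, 0.2255615608745523, 0.4053680561513214]


def standardize(features, mean, var):
    """Apply z = (x - mean) / sqrt(var) to each feature."""
    return [(x - m) / math.sqrt(v) for x, m, v in zip(features, mean, var)]


# size, rooms, baths, parking for one of the test houses
scaled = standardize([180, 5, 2, 0], MEAN_X, VAR_X)
print(scaled)
```

A house larger than average gets a positive scaled size, and a house with no parking gets a negative scaled parking value, since the average is about 0.59.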
-
And also export the information for the scaler for the outputs
mean_y = y_scaler.mean_
var_y = y_scaler.var_
common_file.generate_json_file(list(mean_y), common.root_share_folder, 'scaler-mean-y')
common_file.generate_json_file(list(var_y), common.root_share_folder, 'scaler-var-y')
scaler-mean-y
[281340671.3761141]
scaler-var-y
[3091863947148531.5]
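The output scaler values are what turn a raw model prediction back into a price. Here is a minimal sketch of that post-processing step, again assuming a scikit-learn StandardScaler, whose inverse is price = scaled * sqrt(var) + mean. The function names are made up for illustration.

```python
import math

# Values copied from the exported scaler-mean-y and scaler-var-y files
MEAN_Y = 281340671.3761141
VAR_Y = 3091863947148531.5


def scale_price(price):
    """Convert a price into the scaled value the model is trained on."""
    return (price - MEAN_Y) / math.sqrt(VAR_Y)


def unscale_price(scaled):
    """Convert a scaled model output back into a price."""
    return scaled * math.sqrt(VAR_Y) + MEAN_Y


# Round trip with one of the prices from the dataset
scaled = scale_price(310000000)
restored = unscale_price(scaled)
print(scaled, round(restored))
```

A model output of 0 corresponds to the average price, and each unit above or below it is one standard deviation of the prices in the dataset.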
-
Use the tensorflowjs_converter tool to transform the model so that it can be imported into Tensorflow.js
tensorflowjs_converter --input_format keras ./model/-inputsscaled-outputsscaled-categorical/model.h5 ./shared/model
-
And that's it! phew! Let me know if you have any questions!