Introduction
In this tutorial, we’ll create and train a machine learning model, or as we call it, an AI Table or a predictor. By querying the model, we’ll predict the
rental prices of the properties based on their attributes, such as the number of
rooms, area, or neighborhood.
Make sure you have access to a working MindsDB installation, either locally or
at MindsDB Cloud.
If you want to learn how to set up your account at MindsDB Cloud, follow
this guide. Another way is to set up
MindsDB locally using
Docker or
Python.
Let’s get started.
Data Setup
Connecting the Data
There are a couple of ways you can get the data to follow through with this tutorial.
- Connecting as a database
- Connecting as a file

You can connect to a demo database that we’ve prepared for you. It contains the data used throughout this tutorial (the example_db.demo_data.home_rentals table).
Once connected, you can run queries directly on the demo database. Let’s preview the data that we’ll use to train our predictor.
Pay Attention to the Queries
From now on, we’ll use the example_db.demo_data.home_rentals table. Make sure you replace it with files.home_rentals if you connect the data as a file.
Understanding the Data
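The connection and preview steps from Connecting the Data might look like the sketch below. The engine name and every connection value shown are assumptions or placeholders, not the demo database's real details; substitute the values provided for the demo database.

```sql
-- Sketch only: the engine is assumed to be PostgreSQL, and all <...> values
-- are placeholders rather than the demo database's real credentials.
CREATE DATABASE example_db
WITH ENGINE = 'postgres',
PARAMETERS = {
    "user": "<user>",
    "password": "<password>",
    "host": "<host>",
    "port": "5432",            -- default PostgreSQL port, assumed here
    "database": "<database>"
};

-- Preview a few rows of the data used to train the predictor.
SELECT *
FROM example_db.demo_data.home_rentals
LIMIT 10;
```

If you connected the data as a file, the preview query would read from files.home_rentals instead.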
We use the home rentals dataset, where each row is one property, to predict the rental_price column value for all the newly added properties.
Below is the sample data stored in the example_db.demo_data.home_rentals
table.
| Column | Description | Data Type | Usage | 
|---|---|---|---|
| number_of_rooms | Number of rooms in a property [0,1,2,3]. | integer | Feature | 
| number_of_bathrooms | Number of bathrooms in a property [1,2]. | integer | Feature | 
| sqft | Area of a property in square feet. | integer | Feature | 
| location | Rating of the location of a property [poor, great, good]. | character varying | Feature | 
| days_on_market | Number of days a property has been on the market. | integer | Feature | 
| neighborhood | Neighborhood [alcatraz_ave, westbrae, ..., south_side, thowsand_oaks]. | character varying | Feature | 
| rental_price | Rental price of a property in USD. | integer | Label | 
Labels and Features
A label is a column whose values will be predicted (the y variable in simple
linear regression).
A feature is a column used to train the model (the x variable in simple linear regression).
Training a Predictor
Let’s create and train the machine learning model. For that, we use the CREATE MODEL statement and specify the
input columns used to train FROM (features) and what we want to
PREDICT (labels).
In this tutorial, the label is the rental_price column,
whose values will be predicted.
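A minimal sketch of such a statement follows. The model name home_rentals_model is a hypothetical choice made for illustration, and the query assumes the data was connected as the example_db database.

```sql
-- home_rentals_model is an illustrative name, not one fixed by the tutorial text.
CREATE MODEL mindsdb.home_rentals_model
FROM example_db
    (SELECT * FROM demo_data.home_rentals)  -- all columns; non-label columns act as features
PREDICT rental_price;                       -- the label column
```

If you connected the data as a file, the FROM clause would instead name files and select from home_rentals.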
Status of a Predictor
Training a predictor may take a couple of minutes to complete. You can monitor the status of the predictor with a SQL query. Once the status is complete, we can start making
predictions!
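One way to check the status is the DESCRIBE statement, sketched here for the hypothetical home_rentals_model name assumed for illustration.

```sql
-- Returns a row describing the model, including its status column.
-- home_rentals_model is an assumed, illustrative model name.
DESCRIBE home_rentals_model;
```

Rerun the query until the status column reads complete.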
Making Predictions
Making a Single Prediction
You can make predictions by querying the predictor as if it were a table. The SELECT statement lets you make predictions for the label
based on the chosen features.
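A single prediction might be requested as in the sketch below. The model name and the feature values in the WHERE clause are made up for illustration; the column names and category values come from the dataset table above.

```sql
-- Feature values are illustrative; home_rentals_model is an assumed model name.
SELECT rental_price
FROM mindsdb.home_rentals_model
WHERE number_of_rooms = 2
  AND number_of_bathrooms = 1
  AND sqft = 800
  AND location = 'good'
  AND neighborhood = 'westbrae'
  AND days_on_market = 10;
```

Each condition in the WHERE clause supplies one feature value for the property being priced.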
Making Batch Predictions
Also, you can make bulk predictions by joining a data table with your predictor using JOIN.
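A batch prediction could be sketched as follows, again assuming the hypothetical home_rentals_model name and the example_db connection. The column aliases are illustrative.

```sql
-- Joins every row of the data table with the model to get one prediction per row.
-- home_rentals_model and the aliases are assumptions made for illustration.
SELECT t.rental_price AS real_price,
       m.rental_price AS predicted_price,
       t.number_of_rooms,
       t.sqft,
       t.neighborhood
FROM example_db.demo_data.home_rentals AS t
JOIN mindsdb.home_rentals_model AS m
LIMIT 100;
```

Selecting the real and predicted prices side by side makes it easy to compare them row by row.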
What’s Next?
Have fun while trying it out yourself!
- Bookmark the MindsDB repository on GitHub.
- Sign up for a free MindsDB account.
- Engage with the MindsDB community on Slack or GitHub to ask questions and share your ideas and thoughts.