Boston Housing Problem

Description

The Boston Housing Problem is a well-known regression task: predict the median value of owner-occupied homes in Boston suburbs from features such as crime rate, number of rooms, property tax rate, and accessibility to employment centers. It introduces learners to supervised regression modeling and to evaluation metrics such as RMSE, MAE, and R².

Dataset

Below is the dataset used for the Boston Housing task. The table shows features such as crime rate, number of rooms, and accessibility indexes, with the median home value (in $1000s) in the last column as the target variable.

CRIM	ZN	INDUS	CHAS	NOX	RM	AGE	DIS	RAD	TAX	PTRATIO	B	LSTAT	MEDV

Column Legend

CRIM: Per capita crime rate by town
ZN: Proportion of residential land zoned for lots over 25,000 sq. ft.
INDUS: Proportion of non-retail business acres per town
CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX: Nitric oxide concentration (parts per 10 million)
RM: Average number of rooms per dwelling
AGE: Proportion of owner-occupied units built before 1940
DIS: Weighted distances to five Boston employment centers
RAD: Index of accessibility to radial highways
TAX: Full-value property-tax rate per $10,000
PTRATIO: Pupil-teacher ratio by town
B: 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents
LSTAT: Percentage of the population classified as lower status
MEDV: Median value of owner-occupied homes in $1000s
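
B is the only column in the legend defined by a formula, and that formula is not monotonic in Bk. The short Python sketch below evaluates it for a few proportions; the function name `b_feature` is a hypothetical helper introduced here for illustration, not part of the dataset or of any library.

```python
def b_feature(bk: float) -> float:
    """Compute B = 1000 * (Bk - 0.63)^2 for a proportion Bk in [0, 1]."""
    if not 0.0 <= bk <= 1.0:
        raise ValueError("Bk is a proportion and must lie in [0, 1]")
    return 1000.0 * (bk - 0.63) ** 2


# B is zero at Bk = 0.63 and grows as Bk moves away in either direction,
# so two different proportions can share the same B value.
for bk in (0.0, 0.63, 1.0):
    print(f"Bk = {bk:.2f} -> B = {b_feature(bk):.1f}")
```

With Bk = 0 the formula gives 396.9, which matches the B value listed under Other Input Features.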

Visualize Data

Use the visualizations below to explore the dataset. Select a feature from the dropdown menu to see its distribution and trends.

Select Feature:

Model

Configure the architecture of the neural network. The number of units in a layer is the number of neurons in that layer, which determines the layer's capacity to learn patterns from the data. Each neuron's activation function transforms the weighted sum of its inputs into its output.

Layer 1 Units:

Layer 1 Activation:

Layer 2 Units:

Layer 2 Activation:
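
To make units and activations concrete, here is a minimal pure-Python sketch of a forward pass through two hidden layers. It is not this page's implementation. The unit counts (16 and 8), the ReLU activations, the single linear output neuron, and the helper names `dense`, `init_layer`, and `forward` are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per neuron


def relu(x: float) -> float:
    return max(0.0, x)


def identity(x: float) -> float:
    return x


def dense(inputs: Vector, weights: Matrix, biases: Vector,
          activation: Callable[[float], float]) -> Vector:
    """Each neuron takes a weighted sum of the inputs plus a bias,
    then passes it through the activation function."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_inputs: int, n_units: int, rng: random.Random):
    """Small random weights and zero biases for a layer of n_units neurons."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_units)]
    return weights, [0.0] * n_units


def forward(features: Vector, layers) -> float:
    """Run the features through every layer and return the single output."""
    out = features
    for weights, biases, activation in layers:
        out = dense(out, weights, biases, activation)
    return out[0]


rng = random.Random(0)
n_features = 13                      # the 13 input columns, CRIM through LSTAT
layer1_units, layer2_units = 16, 8   # assumed values for the two unit fields

w1, b1 = init_layer(n_features, layer1_units, rng)
w2, b2 = init_layer(layer1_units, layer2_units, rng)
w3, b3 = init_layer(layer2_units, 1, rng)   # assumed output neuron for MEDV

layers = [(w1, b1, relu), (w2, b2, relu), (w3, b3, identity)]
prediction = forward([0.5] * n_features, layers)  # placeholder inputs
print(prediction)
```

Raising the unit counts adds rows to each weight matrix, which is what gives a layer more capacity. The weights here are untrained, so the printed value is meaningless as a price.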

Model Training

In this section, configure the parameters used to train the neural network. The train/test split ratio sets how much of the data is used for training and how much is held out for testing. The learning rate controls the size of each update to the model's weights. The number of epochs is the number of complete passes the model makes over the training data. The optimizer is the algorithm used to minimize the loss function.

Train/Test Split Ratio: 80% Training, 20% Testing

Learning Rate: 0.01

Max Epochs:

Optimizer:

Epoch #: 0 Loss: N/A
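
The sketch below shows how these settings interact in a training loop. To stay short it swaps the neural network for a one-feature linear model and uses plain full-batch gradient descent as the optimizer. The data is synthetic rather than the housing data, and the helper names and the 500-epoch limit are assumptions; only the 80/20 split and the 0.01 learning rate come from the fields above.

```python
import random


def train_test_split(rows, train_fraction, rng):
    """Shuffle a copy of the rows and cut it at the train fraction."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mse(rows, w, b):
    """Mean squared error of the line y = w * x + b on the given rows."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)


def train(rows, learning_rate, max_epochs):
    """Full-batch gradient descent on y = w * x + b with MSE loss."""
    w, b = 0.0, 0.0
    history = []
    n = len(rows)
    for _ in range(max_epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / n
        # The learning rate scales how far each step moves the parameters.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mse(rows, w, b))  # loss after this epoch
    return w, b, history


rng = random.Random(42)
# Synthetic data, not the housing data: y = 3x + 2 plus a little noise.
xs = [rng.uniform(-1.0, 1.0) for _ in range(100)]
data = [(x, 3.0 * x + 2.0 + rng.gauss(0.0, 0.1)) for x in xs]

train_rows, test_rows = train_test_split(data, 0.8, rng)
w, b, history = train(train_rows, learning_rate=0.01, max_epochs=500)

print(f"Epoch #: {len(history)} Loss: {history[-1]:.4f}")
print(f"Test loss: {mse(test_rows, w, b):.4f}")
```

A larger learning rate reaches a low loss in fewer epochs but can overshoot and diverge; a smaller one is more stable but needs more epochs.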

Model Evaluation

Evaluate the trained model using the regression metrics RMSE, MAE, and R². RMSE and MAE measure prediction error in the units of the target ($1000s), with RMSE penalizing large errors more heavily. R² is the proportion of the variance in home values that the model explains.

RMSE: N/A

MAE: N/A

R²: N/A
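
For reference, the three metrics can be computed as in the sketch below. It uses the standard definitions in pure Python; the function names and the sample values are made up for illustration and are not taken from the dataset.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """1 - SS_res / SS_tot. Undefined when every true value is identical."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up median values in $1000s, for illustration only.
y_true = [20.0, 25.0, 30.0]
y_pred = [22.0, 25.0, 27.0]

print(f"RMSE: {rmse(y_true, y_pred):.3f}")
print(f"MAE:  {mae(y_true, y_pred):.3f}")
print(f"R²:   {r_squared(y_true, y_pred):.3f}")
```

An R² of 1 means perfect predictions, 0 means the model does no better than always predicting the mean, and negative values mean it does worse.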

Prediction

Use this section to predict the median home value from selected input features. Some fields are predefined, while others can be entered manually. The prediction is computed by the trained model.

Predefined Configurations:

CRIM (Per Capita Crime Rate):

RM (Average Number of Rooms):

TAX (Property Tax Rate):

PTRATIO (Pupil-Teacher Ratio):

Other Input Features

ZN: 0
INDUS: 7.07
CHAS: 0
NOX: 0.469
AGE: 78.9
DIS: 4.9671
B: 396.9
LSTAT: 9.14
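
Before a model can make a prediction, the entered and fixed values have to be combined into one vector in the dataset's column order. The sketch below shows one way to do that; `build_feature_vector` is a hypothetical helper, and the entered values are illustrative. RAD has no field or default listed above, so the sketch assumes the caller supplies it.

```python
from typing import Dict, List

# Input columns in the order they appear in the dataset (MEDV is the target).
COLUMN_ORDER = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Fixed values listed under "Other Input Features".
OTHER_INPUTS = {"ZN": 0.0, "INDUS": 7.07, "CHAS": 0.0, "NOX": 0.469,
                "AGE": 78.9, "DIS": 4.9671, "B": 396.9, "LSTAT": 9.14}


def build_feature_vector(entered: Dict[str, float]) -> List[float]:
    """Merge entered values with the fixed ones, in dataset column order."""
    merged = {**OTHER_INPUTS, **entered}
    missing = [name for name in COLUMN_ORDER if name not in merged]
    if missing:
        raise ValueError(f"missing input features: {missing}")
    return [float(merged[name]) for name in COLUMN_ORDER]


# Hypothetical entries for illustration only, including an assumed RAD value.
entered = {"CRIM": 0.03, "RM": 6.4, "TAX": 242.0, "PTRATIO": 17.8, "RAD": 2.0}
features = build_feature_vector(entered)
print(features)
```

Keeping the column order in one constant guards against feeding the model values in a different order from the one it was trained on.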