Boston Housing Problem

Description

The Boston Housing Problem is a well-known regression task: predict the median value of owner-occupied homes in Boston suburbs from features such as crime rate, number of rooms, property-tax rate, and accessibility to employment centers. It introduces learners to supervised regression modeling and to evaluation metrics such as RMSE, MAE, and R².


Dataset

Below is the dataset used for the Boston Housing task. The table shows features such as crime rate, number of rooms, and accessibility indexes. The last column, the median home value (in $1000s), is the target variable.

CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | MEDV

Column Legend

  • CRIM: Per capita crime rate by town
  • ZN: Proportion of residential land zoned for lots over 25,000 sq. ft.
  • INDUS: Proportion of non-retail business acres per town
  • CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
  • NOX: Nitric oxides concentration (parts per 10 million)
  • RM: Average number of rooms per dwelling
  • AGE: Proportion of owner-occupied units built before 1940
  • DIS: Weighted distances to five Boston employment centers
  • RAD: Index of accessibility to radial highways
  • TAX: Full-value property-tax rate per $10,000
  • PTRATIO: Pupil-teacher ratio by town
  • B: 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents
  • LSTAT: Percentage of the population classified as lower status
  • MEDV: Median value of owner-occupied homes in $1000s
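The B column is the only feature in the legend defined by a formula, so a short worked example may help. The sketch below evaluates 1000(Bk - 0.63)^2 for a few proportions; the function name `b_feature` is introduced here for illustration and does not come from the page.

```python
def b_feature(bk: float) -> float:
    """Return B = 1000 * (Bk - 0.63)^2 for a proportion Bk in [0, 1]."""
    if not 0.0 <= bk <= 1.0:
        raise ValueError("Bk is a proportion and must lie in [0, 1]")
    return 1000.0 * (bk - 0.63) ** 2


# The transform is not monotonic: B falls to 0 at Bk = 0.63 and rises again,
# so two different proportions can map to the same B value.
for bk in (0.0, 0.25, 0.63, 1.0):
    print(f"Bk={bk:.2f} -> B={b_feature(bk):.2f}")
```

With Bk = 0 the formula gives B = 396.9, which is the largest value the column can take.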

Visualize Data

Use the visualizations below to explore the dataset. Select a feature from the dropdown menu to see its distribution and trends.
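For readers working outside the page, the same two views (a feature's distribution and its trend against the target) can be approximated numerically. The sketch below is one possible approach using only the standard library; the `rm` and `medv` lists are hypothetical toy values, not rows from the dataset, and the helper names are illustrative.

```python
import statistics

# Hypothetical toy values for illustration only; not rows from the dataset.
rm = [5.9, 6.1, 6.4, 6.6, 7.0, 7.2, 5.6, 6.3]
medv = [19.0, 21.5, 23.0, 25.5, 33.0, 35.0, 17.5, 22.5]


def histogram(values, bins=4):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values identical: everything lands in the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient: a rough summary of a linear trend."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


print("bin counts:", histogram(rm))
print("correlation with target:", round(pearson(rm, medv), 3))
```

A correlation near +1 or -1 suggests a strong linear trend against the target; a value near 0 suggests little linear relationship, though a nonlinear one may still exist.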

Model

Configure the architecture of the neural network model for training. The number of units in a layer is the number of neurons in that layer; more units give the layer more capacity to learn patterns from the data. The activation function is the transformation each neuron applies to the weighted sum of its inputs to produce its output.
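To make units and activation functions concrete, here is a minimal NumPy sketch of a fully connected network's forward pass. It is not the page's implementation: the layer sizes (16, 8, 1), the initialization scheme, and the names `init_layers` and `forward` are assumptions chosen for illustration.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


ACTIVATIONS = {"relu": relu, "tanh": np.tanh, "linear": lambda z: z}


def init_layers(n_inputs, layer_units, seed=0):
    """Create one (weights, bias) pair per layer.

    layer_units lists the number of neurons in each layer, so a layer with
    `units` neurons and `fan_in` inputs owns a (fan_in, units) weight matrix.
    """
    rng = np.random.default_rng(seed)
    layers, fan_in = [], n_inputs
    for units in layer_units:
        # Scaled normal initialization (an assumption, a common default).
        w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, units))
        layers.append((w, np.zeros(units)))
        fan_in = units
    return layers


def forward(x, layers, activations):
    """Each layer computes activation(x @ W + b) and feeds the next layer."""
    for (w, b), name in zip(layers, activations):
        x = ACTIVATIONS[name](x @ w + b)
    return x


# 13 input features, two hidden layers, one linear output unit for MEDV.
layers = init_layers(13, [16, 8, 1])
batch = np.zeros((4, 13))
out = forward(batch, layers, ["relu", "relu", "linear"])
print(out.shape)  # (4, 1)
```

The final layer uses a linear activation because the target is an unbounded real value; a squashing activation there would cap the predictions.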

Model Training

In this section, configure the parameters used to train the neural network. The train/test split ratio determines how much of the data is used for training and how much is held out for testing. The learning rate controls the size of each update to the model's weights. The number of epochs is the number of complete passes the model makes over the training dataset. The optimizer is the algorithm used to minimize the loss function.

Train/test split: 80% Training, 20% Testing
Learning rate: 0.01
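The roles of the split ratio, learning rate, epochs, and optimizer can be seen together in a small training loop. The following is a sketch under stated assumptions, not the page's training code: it uses synthetic stand-in data, a single hidden layer of 8 units, 200 epochs, and plain full-batch gradient descent as the optimizer. Only the 80/20 split and the 0.01 learning rate mirror the values shown above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 200 rows, 3 features, noisy linear target.
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

# Train/test split: shuffle row indices, keep the first 80% for training.
idx = rng.permutation(len(X))
n_train = int(0.8 * len(X))
train, test = idx[:n_train], idx[n_train:]

learning_rate = 0.01  # step size of each weight update
epochs = 200          # full passes over the training set (hypothetical value)
hidden = 8            # hypothetical hidden-layer width

W1 = rng.normal(0.0, 0.5, size=(3, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
b2 = np.zeros(1)


def predict(X):
    h = np.maximum(X @ W1 + b1, 0.0)  # ReLU hidden layer
    return (h @ W2 + b2).ravel()


def mse(X, y):
    return float(np.mean((predict(X) - y) ** 2))


initial_loss = mse(X[train], y[train])
for epoch in range(epochs):
    Xb, yb = X[train], y[train]
    z = Xb @ W1 + b1
    h = np.maximum(z, 0.0)
    err = (h @ W2 + b2).ravel() - yb        # prediction error per row
    # Gradients of the mean squared error, by backpropagation.
    g_out = (2.0 / len(yb)) * err[:, None]  # dLoss/dOutput
    gW2 = h.T @ g_out
    gb2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (z > 0)          # back through the ReLU
    gW1 = Xb.T @ g_h
    gb1 = g_h.sum(axis=0)
    # Plain gradient-descent update; a different optimizer would replace this step.
    W1 -= learning_rate * gW1
    b1 -= learning_rate * gb1
    W2 -= learning_rate * gW2
    b2 -= learning_rate * gb2

final_loss = mse(X[train], y[train])
print(f"train MSE: {initial_loss:.3f} -> {final_loss:.3f}")
print(f"test MSE: {mse(X[test], y[test]):.3f}")
```

A larger learning rate takes bigger steps and may converge faster or diverge; more epochs give the model more passes to reduce the training loss, at the risk of overfitting the training rows.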

Model Evaluation

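Evaluate the trained model using the regression metrics RMSE, MAE, and R². RMSE and MAE measure the typical prediction error in the units of the target ($1000s), with RMSE penalizing large errors more heavily. R² is the fraction of the variance in home values that the model explains.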

RMSE: N/A

MAE: N/A

R²: N/A
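The three metrics follow directly from their standard definitions. The sketch below computes them with the standard library only; the function name `regression_metrics` and the sample values are hypothetical and are not results from the page's model.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R²) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)                 # root mean squared error
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² is undefined when every target is identical (zero variance).
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return rmse, mae, r2


# Hypothetical targets and predictions, in $1000s.
y_true = [20.0, 22.0, 30.0, 28.0]
y_pred = [21.0, 21.0, 28.0, 30.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Here the errors are +1, -1, -2, and +2, so MAE is 1.5 while RMSE is the square root of 2.5 (about 1.58); RMSE is larger because squaring weights the two larger errors more.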

Prediction

Use this section to predict the median home value based on selected input features. Some fields are predefined, while others can be manually entered for customization. The prediction will be displayed based on the trained model.

Other Input Features

  • ZN: 0
  • INDUS: 7.07
  • CHAS: 0
  • NOX: 0.469
  • AGE: 78.9
  • DIS: 4.9671
  • B: 396.9
  • LSTAT: 9.14
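A prediction request has to supply all 13 features in a fixed order. The sketch below shows one way to merge the predefined values listed above with manually entered ones. It assumes the model expects the features in the dataset's column order, and that the five features missing from the list (CRIM, RM, RAD, TAX, PTRATIO) are the manual fields; the manual values and the name `build_input` are illustrative, not taken from the page.

```python
FEATURE_ORDER = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Predefined values listed under "Other Input Features".
PREDEFINED = {"ZN": 0.0, "INDUS": 7.07, "CHAS": 0.0, "NOX": 0.469,
              "AGE": 78.9, "DIS": 4.9671, "B": 396.9, "LSTAT": 9.14}


def build_input(manual):
    """Merge manual entries with the predefined ones, in column order."""
    values = {**PREDEFINED, **manual}  # manual entries win on conflict
    missing = [name for name in FEATURE_ORDER if name not in values]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(values[name]) for name in FEATURE_ORDER]


# Illustrative manual entries (assumption; not taken from the page).
manual = {"CRIM": 0.03, "RM": 6.4, "RAD": 2, "TAX": 242, "PTRATIO": 17.8}
row = build_input(manual)
print(row)
```

The resulting list can be passed as a single-row batch to whatever model was trained, after applying the same scaling (if any) that was used on the training data.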