Apple Tree Cultivation Suitability Analysis Using Soil and Environmental Data

Project Overview:

This project focuses on analyzing soil and environmental data to determine the best locations for planting apple trees. By using data from both the soil and the environment, we've built predictive models and analysis tools to simulate suitable growing conditions for apple trees.

Project Objectives:

  • Integrate soil and environmental data to understand soil conditions and geographical features.

  • Use machine learning algorithms (clustering and regression) to simulate ideal growing conditions.

  • Predict soil pH levels and other key parameters to determine suitable areas for apple tree cultivation.

  • Create interactive maps to display the results in an engaging way.

Project Steps:

1. Data Collection:

We started by gathering soil data (pH and the percentages of nitrogen, phosphorus, and potassium) and environmental data (here, the latitude and longitude of each sampling point).

2. Data Analysis:

Using descriptive statistics, we examined the data to identify missing values and better understand the overall dataset.
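As a minimal sketch of this step, the snippet below counts missing values per column and prints summary statistics. The `summarize_missing` helper and the sample values are hypothetical stand-ins for the real spreadsheets, not part of the project code.

```python
import numpy as np
import pandas as pd


def summarize_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column."""
    missing_count = df.isna().sum()
    missing_pct = (missing_count / len(df) * 100).round(1)
    return pd.DataFrame({"missing": missing_count, "missing_pct": missing_pct})


# Hypothetical sample standing in for the real soil spreadsheet
sample = pd.DataFrame({
    "pH": [6.5, np.nan, 7.1, 6.8],
    "N (%)": [0.12, 0.15, np.nan, np.nan],
})

missing_report = summarize_missing(sample)
print(missing_report)

# Descriptive statistics skip missing values by default
print(sample.describe())
```

A per-column summary like this is usually easier to read than the full True/False table that `isnull()` returns on its own.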

3. Data Processing:

We merged the soil and environmental data based on geographical coordinates, which allowed us to analyze the relationship between soil characteristics and location.
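Merging on floating-point coordinates can silently drop rows when the two files store slightly different values for the same point. The sketch below shows one possible safeguard: round the keys to a fixed precision and use pandas' `indicator` option to report unmatched rows. The sample values, the `Elevation (m)` column, and the precision of 5 decimal places are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical samples; the second location row has a slightly different longitude
soil = pd.DataFrame({
    "Latitude": [35.8322, 35.8400, 35.8500],
    "Longitude": [50.9917, 51.0000, 51.0100],
    "pH": [6.5, 7.2, 6.9],
})
location = pd.DataFrame({
    "Latitude": [35.8322, 35.8400],
    "Longitude": [50.9917, 51.00000004],
    "Elevation (m)": [1312, 1340],  # hypothetical column
})

KEYS = ["Latitude", "Longitude"]
PRECISION = 5  # decimal places (assumed; roughly one metre of latitude)

# Round the join keys so tiny floating-point differences do not block a match
for frame in (soil, location):
    frame[KEYS] = frame[KEYS].round(PRECISION)

# A left join keeps every soil sample; the _merge column records whether it matched
merged = pd.merge(soil, location, on=KEYS, how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
print(f"{len(merged) - len(unmatched)} matched, {len(unmatched)} unmatched")
```

Whether unmatched soil samples should be kept or dropped depends on the analysis; the left join simply makes them visible.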

4. Machine Learning Model Development:

  • Clustering: We used KMeans clustering to group similar geographical points based on soil characteristics and environmental factors.

  • Prediction: Using linear regression, we predicted the soil pH in various regions, helping us determine which areas are more suitable for apple tree cultivation.
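The number of clusters is a free parameter in KMeans. One common way to choose it, sketched below, is to compare silhouette scores across several candidate values. The synthetic features and their centers are assumptions used only to make the example self-contained; this is not necessarily how the cluster count was chosen in the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for soil features: three well-separated groups of (pH, N %)
rng = np.random.default_rng(0)
centers = np.array([[5.5, 0.10], [6.5, 0.20], [7.5, 0.30]])  # hypothetical centers
features = np.vstack(
    [c + rng.normal(scale=[0.05, 0.005], size=(30, 2)) for c in centers]
)

# Standardize so that no single feature dominates the distance computation
scaled = StandardScaler().fit_transform(features)

# Fit KMeans for each candidate k and record the mean silhouette score
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    scores[k] = silhouette_score(scaled, labels)

# A higher silhouette score means tighter, better-separated clusters
best_k = max(scores, key=scores.get)
print(scores)
print(f"Best number of clusters: {best_k}")
```

On real soil data the scores are rarely this clear-cut, so the result is best treated as a guide alongside domain knowledge.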

5. Results:

  • Correlation Analysis: We examined the correlation between soil characteristics to see how they relate to each other.

  • Clustering Results: We identified and visualized groups of regions with similar characteristics, helping us categorize locations based on suitability.

  • Predictions: We generated predictions for soil pH across the region, highlighting areas that would be optimal for planting apple trees.
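One way to turn predicted pH into a suitability label is a simple range check, sketched below. The range of 6.0 to 7.0 is an assumption chosen for illustration and should be replaced with local agronomic guidance; the `label_suitability` helper and the sample predictions are hypothetical.

```python
import pandas as pd

# Assumed target pH range for apple trees; adjust to local guidance
PH_MIN, PH_MAX = 6.0, 7.0


def label_suitability(predicted_ph: pd.Series) -> pd.Series:
    """Label each predicted pH as suitable or not, using an inclusive range check."""
    in_range = predicted_ph.between(PH_MIN, PH_MAX)
    return in_range.map({
        True: "Suitable for cultivation",
        False: "Not suitable for cultivation",
    })


# Hypothetical predictions standing in for the regression output
predictions = pd.DataFrame({"Predicted pH": [5.4, 6.2, 6.9, 7.6]})
predictions["Cultivation Suitability"] = label_suitability(predictions["Predicted pH"])
print(predictions)
```

A threshold like this ties the label directly to the predicted value, which makes it easier to explain than a label based on cluster membership alone.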

6. Interactive Maps:

Using Folium, we created interactive maps that show the results of our clustering and prediction models. Users can click a marker to see the suitability of that area for planting apple trees, along with its soil pH.

Technologies Used:

  • Python: For data processing, analysis, and machine learning algorithms.

  • Libraries:

    • pandas: For data manipulation and analysis.

    • openpyxl: To work with Excel files.

    • numpy: For numerical computations.

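    • scikit-learn (imported as sklearn): For machine learning models (KMeans clustering and linear regression).

    • matplotlib: For static scatter plots of the clustering and prediction results.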

    • folium: To create interactive maps for data visualization.

  • Flask: We are planning to integrate this project into a web application that will allow users to upload their own data files and see the results.

Future of the Project:

  • Improving the Models: We'll refine the models to make them more accurate and add new features for a more comprehensive analysis.


  • Enhancing User Experience: We're working on improving the web interface for better usability and faster processing of user-uploaded data.


  • Expanding the Scope: We plan to add more soil parameters and expand the analysis to other agricultural applications.

import pandas as pd
import openpyxl  # Excel engine that pd.read_excel uses for .xlsx files
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import folium

# Load soil data
soil_data = pd.read_excel('soil_data_apple.xlsx')

# Load environmental data
location_data = pd.read_excel('location_data_apple.xlsx')
# Display the first few rows of the data
print(soil_data.head())
print(location_data.head())

# Count missing values in each column
print(soil_data.isnull().sum())
print(location_data.isnull().sum())

# Summary statistics for the numeric columns
print(soil_data.describe())
print(location_data.describe())

# Assume that the soil and environmental data are aligned with latitude and longitude columns
# Merge the two dataframes based on geographical location (latitude and longitude)
combined_data = pd.merge(soil_data, location_data, on=['Latitude', 'Longitude'])
print(combined_data.head())

# Descriptive statistics of the data
descriptive_stats = combined_data.describe()
print(descriptive_stats)

# Select only numeric columns
numeric_data = combined_data.select_dtypes(include=[np.number])

# Calculate the correlation matrix
corr_matrix = numeric_data.corr()

# Display the correlation matrix
print(corr_matrix)

# Select features for clustering to find locations with similar soil conditions
clustering_data = combined_data[['Latitude', 'Longitude', 'pH', 'N (%)', 'P (%)', 'K (%)']]

# Normalize the data (scaling is important for clustering algorithms)
scaler = StandardScaler()
clustering_data_scaler = scaler.fit_transform(clustering_data)

# Run the KMeans clustering algorithm
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)  # Number of clusters can be adjusted; n_init is set explicitly so results do not depend on the scikit-learn version
combined_data['Cluster'] = kmeans.fit_predict(clustering_data_scaler)

# Display the clustering results
plt.scatter(combined_data['Longitude'], combined_data['Latitude'], c=combined_data['Cluster'], cmap='viridis')
plt.title('Clustering of Locations for Apple Trees')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.show()

# Predict soil pH from location and nutrient levels, as an indicator of suitable planting areas
# Select features and target
X = combined_data[['Latitude', 'Longitude', 'N (%)', 'P (%)', 'K (%)']]  # Input features
y = combined_data['pH']  # Target feature (soil pH)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the linear regression model
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Make predictions
y_pred = regressor.predict(X_test)

# Evaluate the model 
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

# Predict pH for all locations
combined_data['Predicted pH'] = regressor.predict(X)

# Display the prediction results
plt.scatter(combined_data['Longitude'], combined_data['Latitude'], c=combined_data['Predicted pH'], cmap='coolwarm')
plt.colorbar(label='Predicted pH')
plt.title('Predicted pH for Apple Tree Planting Locations')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.show()

# Add a column marking each location as suitable or unsuitable for cultivation,
# based on the KMeans label stored in the 'Cluster' column

def classify_for_cultivation(value):
    # NOTE: with n_clusters=3 the labels are 0, 1 and 2, so this example list marks
    # every location as suitable; narrow it after inspecting the clusters
    if value in [0, 1, 2]:
        return "Suitable for cultivation"
    else:  # Any cluster label not listed above
        return "Not suitable for cultivation"

# Apply the classification to the cluster label of every location
combined_data['Cultivation Suitability'] = combined_data['Cluster'].apply(classify_for_cultivation)

# Create a map centered on Karaj
m = folium.Map(location=[35.8322, 50.9917], zoom_start=12)

# Add markers for each point (Latitude, Longitude)
for index, row in combined_data.iterrows():
    folium.Marker(
        [row['Latitude'], row['Longitude']],
        popup=f"This point: {row['Cultivation Suitability']}<br>pH value: {row['pH']}"  # pH is unitless; <br> breaks the line in the HTML popup
    ).add_to(m)

# Save the map to an HTML file
m.save("apple_growing_map_with_suitability.html")

If you don't have a GitHub account, the full code is included above.

Relevant Links and GitHub Repository

To access the project code, visit the [GitHub Project](GitHub link). There you can find all the code and datasets related to the project.

