---
license: cc-by-4.0
task_categories:
- tabular-regression
- time-series-forecasting
language:
- fr
tags:
- agriculture
- herbicides
- weed-pressure
- crop-rotation
- france
- bretagne
- sustainability
- precision-agriculture
- ift
- treatment-frequency-index
size_categories:
- 1K<n<10K
pretty_name: "Station Expérimentale de Kerguéhennec - Agricultural Interventions"
configs:
- config_name: default
  data_files:
  - split: train
    path: "*.csv"
---
# 🚜 Station Expérimentale de Kerguéhennec - Agricultural Interventions Dataset

## Dataset Description

This dataset contains agricultural intervention records from the Station Expérimentale de Kerguéhennec in Brittany, France, covering 2014 to 2024. It documents agricultural practices, crop rotations, herbicide treatments, and field management operations across 100 plots.

## Dataset Summary

- **Source**: Station Expérimentale de Kerguéhennec, Brittany, France
- **Time Period**: 2014-2024 (10 years of data; 2017 is missing)
- **Location**: Brittany (Bretagne), France
- **Records**: 4,663 intervention records
- **Plots**: 100 unique agricultural parcels
- **Crops**: 42 different crop types
- **Format**: CSV exports from a farm management system
- **Language**: French (field names and crop types)

## Primary Use Cases

This dataset is particularly valuable for:

1. **🌿 Weed Pressure Analysis**: Calculate and predict Treatment Frequency Index (IFT) for herbicides
2. **🔄 Crop Rotation Optimization**: Analyze the impact of different crop sequences on pest pressure
3. **🌱 Sustainable Agriculture**: Support reduction of herbicide use while maintaining productivity
4. **🎯 Precision Agriculture**: Identify suitable plots for sensitive crops (peas, beans)
5. **📊 Agricultural Research**: Study relationships between farming practices and outcomes
6. **🤖 Machine Learning**: Train models for agricultural prediction and decision support

## Data Structure

### Core Fields

| Field | Description | Type | Example |
|-------|-------------|------|---------|
| `millesime` | Year of intervention | Integer | 2024 |
| `nomparc` | Plot/field name | String | "Etang Milieu" |
| `surfparc` | Plot surface area (hectares) | Float | 2.28 |
| `libelleusag` | Crop type/usage | String | "pois de conserve" |
| `datedebut` | Intervention start date (day/month/two-digit year) | Date | "20/2/24" |
| `datefin` | Intervention end date (day/month/two-digit year) | Date | "20/2/24" |
| `libevenem` | Intervention type | String | "Semis classique" |
| `familleprod` | Product family | String | "Herbicides" |
| `produit` | Specific product used | String | "CALLISTO" |
| `quantitetot` | Total quantity applied | Float | 1.5 |
| `unite` | Unit of measurement | String | "L" |
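
The `datedebut` and `datefin` examples look like day/month/two-digit-year strings. The following is a minimal sketch of parsing them with pandas, assuming that format holds for every row. The sample rows are made up for illustration and the `duration_days` column is a hypothetical addition.

```python
import pandas as pd

# Made-up rows mirroring the documented columns; not taken from the dataset.
sample = pd.DataFrame({
    "nomparc": ["Etang Milieu", "Etang Milieu"],
    "datedebut": ["20/2/24", "3/4/24"],
    "datefin": ["20/2/24", "5/4/24"],
})

# Assumption: dates are day/month/two-digit year, as in the "20/2/24" example.
for col in ("datedebut", "datefin"):
    sample[col] = pd.to_datetime(sample[col], format="%d/%m/%y")

# Duration of each intervention in whole days (0 = same-day operation).
sample["duration_days"] = (sample["datefin"] - sample["datedebut"]).dt.days
print(sample)
```

Passing an explicit `format` avoids pandas guessing month-first for ambiguous values such as "3/4/24".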

### Derived Fields (Added During Processing)

| Field | Description | Type |
|-------|-------------|------|
| `year` | Standardized year | Integer |
| `crop_type` | Standardized crop classification | String |
| `is_herbicide` | Boolean flag for herbicide treatments | Boolean |
| `is_fungicide` | Boolean flag for fungicide treatments | Boolean |
| `is_insecticide` | Boolean flag for insecticide treatments | Boolean |
| `plot_name` | Standardized plot name | String |
| `intervention_type` | Standardized intervention classification | String |
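
The processing code that produced these fields is not documented in this card. As a rough sketch only, flags such as `is_herbicide` could be derived from `familleprod` along the following lines. The substring rule, the `FAMILY_FLAGS` mapping, and the sample rows are all assumptions. The keywords follow the English family names used in this card and may need adjusting if the CSV uses French labels.

```python
import pandas as pd

# Made-up rows; `familleprod` values follow the families listed in this card.
sample = pd.DataFrame({
    "millesime": [2024, 2024, 2023, 2023],
    "familleprod": ["Herbicides", "Fungicides", "Insecticides", None],
})

# Hypothetical mapping from derived flag name to the keyword it matches.
FAMILY_FLAGS = {
    "is_herbicide": "herbicide",
    "is_fungicide": "fungicide",
    "is_insecticide": "insecticide",
}

family = sample["familleprod"].str.lower()
for flag, keyword in FAMILY_FLAGS.items():
    # na=False treats rows without a product family (e.g. sowing) as non-matches.
    sample[flag] = family.str.contains(keyword, na=False)

# Assumed derivation of the standardized year from the raw `millesime` column.
sample["year"] = sample["millesime"].astype(int)
print(sample)
```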

## Key Statistics

### Temporal Coverage

- **Years**: 2014-2024 (missing 2017 due to data format issues)
- **Seasons**: All agricultural seasons represented
- **Frequency**: Multiple interventions per plot per year

### Spatial Coverage

- **Plots**: 100 unique agricultural parcels
- **Surface**: Variable plot sizes (0.43 to 5+ hectares)
- **Location**: Single experimental station (controlled conditions)

### Intervention Types

- **Herbicide applications**: 800+ treatments
- **Total interventions**: 4,663 records
- **Product families**: Herbicides, Fungicides, Insecticides, Fertilizers
- **Most common crops**: Wheat, Corn, Rapeseed

## Treatment Frequency Index (IFT)

### Definition

The IFT (Indice de Fréquence de Traitement) is used here in a simplified form, calculated per plot and year as:

```
IFT = Number of applications / Plot surface area (ha)
```

This is a proxy. The official French IFT weights each application by the applied dose relative to the product's reference dose and by the share of the plot that was treated.

### Interpretation

- **IFT < 1.0**: Low weed pressure (suitable for sensitive crops)
- **IFT 1.0-2.0**: Moderate pressure (monitoring required)
- **IFT > 2.0**: High pressure (intervention needed)
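
The formula and thresholds above can be sketched as two small helpers. The function names `simple_ift` and `classify_ift` are hypothetical, and the input numbers are illustrative rather than taken from the dataset.

```python
def simple_ift(n_applications: int, surface_ha: float) -> float:
    """Simplified IFT: herbicide applications divided by plot area in hectares."""
    if surface_ha <= 0:
        raise ValueError("surface_ha must be positive")
    return n_applications / surface_ha


def classify_ift(ift: float) -> str:
    """Map an IFT value to the pressure category used in this card."""
    if ift < 1.0:
        return "low"
    if ift <= 2.0:
        return "moderate"
    return "high"


# Illustrative numbers only: 3 applications on a 2.28 ha plot.
value = simple_ift(3, 2.28)
print(f"IFT = {value:.2f} -> {classify_ift(value)}")
```

Both boundary values (1.0 and 2.0) fall in the moderate band, matching the "1.0-2.0" range in the list.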

### Dataset Statistics

- **Mean IFT**: 1.93 (moderate pressure)
- **Range**: 0.14 - 6.67
- **Trend**: Decreasing from 2.91 (2014) to 1.74 (2024)

## Data Quality

### Completeness

- **Core fields**: 95%+ completeness for essential variables
- **Date fields**: Well-formatted and consistent
- **Numeric fields**: Validated ranges and units
- **Geographic data**: Anonymized but consistent plot identifiers

### Validation

- **Cross-references**: Product codes validated against official databases
- **Temporal consistency**: Logical intervention sequences
- **Agronomic validity**: Realistic crop rotations and treatment patterns

### Limitations

- **Geographic scope**: Single experimental station (limited geographic diversity)
- **Weather data**: Not included (external source required)
- **Economic data**: Treatment costs not provided
- **Soil characteristics**: Limited soil type information

## Ethical Considerations

### Privacy Protection

- **Location data**: Generalized to protect farm location
- **Personal information**: All farmer-identifying data removed
- **Commercial sensitivity**: Product usage patterns aggregated when appropriate

### Bias Considerations

- **Geographic bias**: Limited to the Brittany region
- **Temporal bias**: Practices in recent years may differ from earlier ones
- **Selection bias**: An experimental station may not represent typical farms
- **Technology bias**: Practices may reflect research station capabilities

## Applications

### 1. Weed Pressure Prediction

Use machine learning models to predict future IFT values based on:

- Historical treatment patterns
- Crop rotation sequences
- Environmental factors
- Plot characteristics

**Example Model Performance**:

- Random Forest Regressor: R² = 0.65-0.85
- Features: Year, plot surface, previous IFT, crop type, rotation sequence
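
The feature list above includes "previous IFT" and the rotation sequence, which are lagged values per plot. The following sketch shows one way to build such features with pandas, assuming a per-plot, per-year table with an `ift` column. The data is made up, and `previous_ift` and `previous_crop` are hypothetical column names. The resulting `X` and `y` could then be passed to a regressor of your choice.

```python
import pandas as pd

# Made-up per-plot, per-year IFT values for illustration only.
ift_data = pd.DataFrame({
    "plot_name": ["A", "A", "A", "B", "B"],
    "year": [2021, 2022, 2023, 2022, 2023],
    "surfparc": [2.28, 2.28, 2.28, 0.43, 0.43],
    "crop_type": ["wheat", "corn", "wheat", "peas", "rapeseed"],
    "ift": [1.8, 2.2, 1.5, 0.9, 0.6],
})

ift_data = ift_data.sort_values(["plot_name", "year"])

# Lagged features: last recorded IFT and crop on the same plot.
# "Previous" means the previous recorded year, which may not be the
# previous calendar year when a year is missing.
ift_data["previous_ift"] = ift_data.groupby("plot_name")["ift"].shift(1)
ift_data["previous_crop"] = ift_data.groupby("plot_name")["crop_type"].shift(1)

# The first observation of each plot has no history, so drop it before training.
features = ift_data.dropna(subset=["previous_ift"]).reset_index(drop=True)

# One-hot encode the categorical crop columns.
X = pd.get_dummies(
    features[["year", "surfparc", "previous_ift", "crop_type", "previous_crop"]],
    columns=["crop_type", "previous_crop"],
)
y = features["ift"]
print(X.shape, y.shape)
```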

### 2. Sustainable Plot Selection

Identify plots suitable for sensitive crops (peas, beans) by:

- Analyzing historical IFT trends
- Evaluating rotation impacts
- Assessing risk levels for future years
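
One possible screening rule for the steps above is to keep plots whose recent mean IFT is below the low-pressure threshold and whose IFT is not rising. This is a sketch under stated assumptions, not the method used to produce the card's results: the look-back window, the trend definition, and the data are all made up.

```python
import pandas as pd

# Made-up per-plot, per-year IFT values for illustration only.
ift_data = pd.DataFrame({
    "plot_name": ["A", "A", "A", "B", "B", "B"],
    "year": [2022, 2023, 2024, 2022, 2023, 2024],
    "ift": [2.4, 2.1, 2.2, 1.1, 0.8, 0.6],
})

RECENT_YEARS = 3    # hypothetical look-back window
LOW_PRESSURE = 1.0  # "low weed pressure" threshold from the Interpretation section

# Keep only the most recent years of records.
recent = ift_data[ift_data["year"] > ift_data["year"].max() - RECENT_YEARS]

summary = recent.sort_values("year").groupby("plot_name")["ift"].agg(
    mean_ift="mean",
    first_ift="first",
    last_ift="last",
)
# Negative trend means the IFT fell over the window.
summary["trend"] = summary["last_ift"] - summary["first_ift"]

# Candidate plots: low average pressure and not getting worse.
candidates = summary[(summary["mean_ift"] < LOW_PRESSURE) & (summary["trend"] <= 0)]
print(candidates)
```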

### 3. Crop Rotation Optimization

Optimize rotation sequences through:

- Impact analysis of different crop sequences
- Identification of beneficial rotations
- Risk assessment for specific transitions

**Best Rotations (Lowest IFT)**:

1. Peas → Rapeseed: IFT 0.62
2. Winter Barley → Rapeseed: IFT 0.64
3. Corn → Spring Barley: IFT 0.69
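
The card does not state how the per-rotation IFT values were computed. One plausible approach, sketched here with made-up data, attributes the IFT of each year to the transition from the previous recorded crop on the same plot and then averages by transition. The `previous_crop`, `rotation`, `mean_ift`, and `n_obs` names are hypothetical.

```python
import pandas as pd

# Made-up values for illustration; the real input would be the per-plot,
# per-year IFT table with the crop grown each year.
ift_data = pd.DataFrame({
    "plot_name": ["A", "A", "A", "B", "B", "B"],
    "year": [2022, 2023, 2024, 2022, 2023, 2024],
    "crop_type": ["peas", "rapeseed", "wheat", "corn", "spring barley", "wheat"],
    "ift": [0.9, 0.6, 1.8, 2.0, 0.7, 1.6],
})

ift_data = ift_data.sort_values(["plot_name", "year"])
# Crop grown in the previous recorded season on the same plot.
ift_data["previous_crop"] = ift_data.groupby("plot_name")["crop_type"].shift(1)

# Drop each plot's first year, which has no preceding crop.
transitions = ift_data.dropna(subset=["previous_crop"]).copy()
transitions["rotation"] = transitions["previous_crop"] + " → " + transitions["crop_type"]

# Mean IFT observed in the second year of each transition, lowest first.
ranking = (
    transitions.groupby("rotation")["ift"]
    .agg(mean_ift="mean", n_obs="count")
    .sort_values("mean_ift")
)
print(ranking)
```

Reporting `n_obs` alongside the mean helps flag rotations that rest on only one or two observations.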

### 4. Herbicide Alternative Analysis

Support reduction strategies through:

- Product usage pattern analysis
- Temporal trend identification
- Alternative strategy development
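
As a starting point for usage-pattern and trend analysis, applications can be counted per product and year. The sketch below uses made-up records. Only CALLISTO appears in this card, and `PRODUCT_X` is a placeholder name.

```python
import pandas as pd

# Made-up herbicide records for illustration only.
herbicides = pd.DataFrame({
    "year": [2022, 2022, 2023, 2023, 2023],
    "produit": ["CALLISTO", "PRODUCT_X", "CALLISTO", "CALLISTO", "PRODUCT_X"],
    "quantitetot": [1.5, 0.8, 1.2, 1.0, 0.5],
})

# Applications per product and year (rows: products, columns: years).
usage = herbicides.pivot_table(
    index="produit",
    columns="year",
    values="quantitetot",
    aggfunc="count",
    fill_value=0,
)

# Year-over-year change in application counts per product.
change = usage.diff(axis=1).iloc[:, 1:]
print(usage)
print(change)
```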

## Code Examples

### Loading the Dataset

```python
import pandas as pd
from datasets import load_dataset

# Load the dataset from the Hugging Face Hub
dataset = load_dataset("HackathonCRA/2024")

# Convert the train split to pandas for analysis
df = dataset["train"].to_pandas()
print(f"Loaded {len(df)} intervention records")
print(f"Covering {df['year'].nunique()} years")
```

### Calculate IFT

```python
# Keep herbicide applications only (rows without a product family are skipped)
herbicides = df[df['familleprod'].str.contains('Herbicides', na=False)]

# One row per plot, year and crop
ift_data = (
    herbicides.groupby(['plot_name', 'year', 'crop_type'])
    .agg(
        quantitetot=('quantitetot', 'sum'),   # total quantity applied
        n_applications=('produit', 'count'),  # number of herbicide applications
        surfparc=('surfparc', 'first'),       # plot surface area (ha)
    )
    .reset_index()
)

# Simplified IFT: applications per hectare
ift_data['ift'] = ift_data['n_applications'] / ift_data['surfparc']
```

### Analyze Crop Rotations

```python
# Create rotation sequences (crop in one recorded year → crop in the next)
rotations = []
for plot in df['plot_name'].unique():
    plot_data = df[df['plot_name'] == plot].sort_values('year')
    # One crop per year: the first crop recorded for that plot and year
    crops = plot_data.groupby('year')['crop_type'].first()
    for i in range(len(crops) - 1):
        rotation = f"{crops.iloc[i]} → {crops.iloc[i+1]}"
        rotations.append({
            'plot': plot,
            'year_from': crops.index[i],
            'year_to': crops.index[i+1],
            'rotation': rotation
        })
rotation_df = pd.DataFrame(rotations)
```

## Related Datasets

- **Weather Data**: Consider integrating with Météo-France data for enhanced analysis
- **Soil Data**: European Soil Database for soil type information
- **Economic Data**: Agricultural input cost databases
- **Regulatory Data**: AMM (Marketing Authorization) product databases

## Citation

If you use this dataset in your research, please cite:

```bibtex
@dataset{hackathon_cra_2024,
  title={Station Expérimentale de Kerguéhennec Agricultural Interventions Dataset},
  author={Hackathon CRA Team},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/HackathonCRA/2024},
  note={Agricultural intervention data from Brittany, France (2014-2024)}
}
```

## License

This dataset is released under the CC-BY-4.0 license, which permits both commercial and research use with proper attribution.

## Updates and Versioning

- **Version 1.0**: Initial release with 2014-2024 data
- **Future versions**: May include additional years or enhanced metadata
- **Quality improvements**: Ongoing validation and cleaning

## Contact

For questions about this dataset, collaboration opportunities, or data corrections, please use the Hugging Face dataset discussion feature or contact the research team through the repository.

---

**Keywords**: agriculture, herbicides, crop rotation, sustainable farming, France, Brittany, IFT, weed management, precision agriculture, time series, regression, treatment frequency