| | UDI | Product ID | Type | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Target | Failure Type |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | M14860 | M | 298.1 | 308.6 | 1551 | 42.8 | 0 | 0 | No Failure |
| 1 | 2 | L47181 | L | 298.2 | 308.7 | 1408 | 46.3 | 3 | 0 | No Failure |
| 2 | 3 | L47182 | L | 298.1 | 308.5 | 1498 | 49.4 | 5 | 0 | No Failure |
| 3 | 4 | L47183 | L | 298.2 | 308.6 | 1433 | 39.5 | 7 | 0 | No Failure |
| 4 | 5 | L47184 | L | 298.2 | 308.7 | 1408 | 40.0 | 9 | 0 | No Failure |
Uncovering Patterns and Anomalies in Manufacturing Data
INFO 523 - Final Project
Project Description:
- Uncovering Patterns and Anomalies in Manufacturing Data
Goals:
Modern factories generate vast amounts of data. Manufacturing equipment continuously monitors parameters such as temperature, vibration, motor speed, and energy consumption through sensors and other methods. Variations in these parameters can indicate shifts in performance that may lead to defects or catastrophic equipment failures. Detecting these shifts has become increasingly important for reducing downtime and boosting productivity.
Advanced techniques such as machine learning, anomaly detection, and image analysis are now used to forecast when equipment might require maintenance, calibration, or material changes. This project uses synthetic public data from Kaggle to compare classification and regression models for predicting these critical events. If time allows, we will also explore anomaly detection techniques on time series data to predict potential failures as early as possible.
Proposed Datasets:
- Source: Kaggle - Machine Predictive Maintenance Classification (Synthetic dataset that reflects real predictive maintenance encountered in the industry)
Data Example: see the table at the top of this page.
- Source: Kaggle - Intelligent Manufacturing Dataset for Predictive Optimization (designed for research in smart manufacturing, AI-driven process optimization, and predictive maintenance)
Data Example:
| | Timestamp | Machine_ID | Operation_Mode | Temperature_C | Vibration_Hz | Power_Consumption_kW | Network_Latency_ms | Packet_Loss_% | Quality_Control_Defect_Rate_% | Production_Speed_units_per_hr | Predictive_Maintenance_Score | Error_Rate_% | Efficiency_Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-01-01 00:00:00 | 39 | Idle | 74.137590 | 3.500595 | 8.612162 | 10.650542 | 0.207764 | 7.751261 | 477.657391 | 0.344650 | 14.965470 | Low |
| 1 | 2024-01-01 00:01:00 | 29 | Active | 84.264558 | 3.355928 | 2.268559 | 29.111810 | 2.228464 | 4.989172 | 398.174747 | 0.769848 | 7.678270 | Low |
| 2 | 2024-01-01 00:02:00 | 15 | Active | 44.280102 | 2.079766 | 6.144105 | 18.357292 | 1.639416 | 0.456816 | 108.074959 | 0.987086 | 8.198391 | Low |
| 3 | 2024-01-01 00:03:00 | 43 | Active | 40.568502 | 0.298238 | 4.067825 | 29.153629 | 1.161021 | 4.582974 | 329.579410 | 0.983390 | 2.740847 | Medium |
| 4 | 2024-01-01 00:04:00 | 8 | Idle | 75.063817 | 0.345810 | 6.225737 | 34.029191 | 4.796520 | 2.287716 | 159.113525 | 0.573117 | 12.100686 | Low |
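As a rough illustration of how rows like these might be read into typed records, the sketch below parses a few of the sample rows with the Python standard library. It assumes the Kaggle file is comma-separated and uses the column names shown in the table; the `parse_row` helper and the reduced column set are hypothetical and chosen only to keep the example short.

```python
import csv
import io
from datetime import datetime

# A few sample rows copied from the table above (subset of columns).
# Assumption: the real file is a comma-separated CSV with these headers.
sample_csv = """Timestamp,Machine_ID,Operation_Mode,Temperature_C,Efficiency_Status
2024-01-01 00:00:00,39,Idle,74.137590,Low
2024-01-01 00:01:00,29,Active,84.264558,Low
2024-01-01 00:02:00,15,Active,44.280102,Low
"""


def parse_row(row):
    """Convert one CSV row (dict of strings) into typed values."""
    return {
        "timestamp": datetime.strptime(row["Timestamp"], "%Y-%m-%d %H:%M:%S"),
        "machine_id": int(row["Machine_ID"]),
        "operation_mode": row["Operation_Mode"],
        "temperature_c": float(row["Temperature_C"]),
        "efficiency_status": row["Efficiency_Status"],
    }


records = [parse_row(r) for r in csv.DictReader(io.StringIO(sample_csv))]

# Example filter: keep only readings taken while the machine was active.
active = [r for r in records if r["operation_mode"] == "Active"]
```

Parsing timestamps up front keeps the records ready for time-ordered analysis later on.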
Project Schedule
Week 1:
Definition of problem statement and goals
Plan to incorporate peer review feedback into project plan
Data cleaning (handling missing values and outliers, defining imputation methods).
Define the key response variable for each dataset, depending on the model: for example, pass/fail defects for a classification model or defect rate for a regression model.
Week 2:
Analyze features (use PCA or other methods to understand which features contribute most to the variability).
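The Week 1 cleaning and response-definition steps could look roughly like the sketch below. It uses only the Python standard library. The helper names (`impute_median`, `iqr_outlier_flags`, `binary_response`), the median imputation, and the 1.5 x IQR rule are assumptions for illustration, not methods the project has committed to. The torque values are modeled on the `Torque [Nm]` column, with one missing value and one artificial outlier added.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def binary_response(failure_type):
    """Map a 'Failure Type' label to a pass/fail target (1 = failure)."""
    return 0 if failure_type == "No Failure" else 1


# Illustrative data: None marks a missing reading, 120.0 is an injected outlier.
torque = [42.8, 46.3, None, 39.5, 40.0, 49.4, 120.0]
cleaned = impute_median(torque)
flags = iqr_outlier_flags(cleaned)
```

Here `flags` marks only the injected 120.0 reading. Whether flagged points are dropped, capped, or kept is a separate decision.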
Classification Model Creation and Validation
Regression Model Creation and Validation
Compare models and recommend the best one.
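For the model comparison step, one possible approach is to score classifiers on precision, recall, and F1, and regressors on RMSE. Accuracy alone can be misleading when failures are rare. The sketch below is a standard-library illustration: the labels, the two hypothetical models (`model_a`, `model_b`), and the choice of F1 as the selection criterion are all assumptions, not project results.

```python
from math import sqrt


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def rmse(y_true, y_pred):
    """Root mean squared error for regression predictions."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up labels (1 = failure) and predictions from two hypothetical models.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
predictions = {
    "model_a": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # misses two failures
    "model_b": [0, 1, 0, 0, 0, 0, 1, 1, 0, 1],  # one false alarm
}

scores = {name: precision_recall_f1(y_true, pred) for name, pred in predictions.items()}

# Pick the model with the highest F1 (index 2 of each score tuple).
best = max(scores, key=lambda name: scores[name][2])
```

In this toy example `model_a` has perfect precision but low recall, so F1 favors `model_b`. If missed failures cost more than false alarms, ranking by recall may be more appropriate.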
Week 3:
Assess anomaly detection and recommend possible next steps.
Prepare final report and presentation
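As a starting point for the Week 3 anomaly detection assessment, a rolling z-score is one simple baseline for time series. The sketch below flags a reading that deviates strongly from the mean of the preceding window. The function name, window size, threshold, and vibration values are all assumptions for illustration. The project may end up using a different technique.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` points by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip flat windows, where the z-score is undefined.
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Made-up vibration readings with a single spike at index 7.
vibration = [3.0, 3.1, 2.9, 3.0, 3.2, 3.1, 3.0, 9.5, 3.1, 3.0]
spikes = rolling_zscore_anomalies(vibration)
```

Because each point is compared only with earlier readings, the same logic could run on streaming data. One known limitation is that a spike inflates the variance of the following windows and can mask nearby anomalies.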
Project Organization
|-- DATA # Raw data files obtained from Kaggle in CSV format
|   |-- processed # Cleaned and processed datasets
|   |-- results # Model evaluation and other results
|-- IMAGES # Images used by the Quarto site
|-- presentation_files # Quarto presentation files
|-- extra # Additional documents or files used in the project
|-- quarto # Quarto files
|-- src # Source code used for the project
|-- .github # GitHub configuration files
|-- requirements.txt # Python dependencies
|-- _quarto.yml # Quarto metadata and configuration
|-- .gitignore # Files and directories to be ignored by Git
|-- about.qmd # Quarto about page with general information about the project
|-- presentation.qmd # Quarto final project presentation
|-- proposal.qmd # Project problem statement and proposal
|-- README.md # Main README file for the repository