Uncovering Patterns and Anomalies in Manufacturing Data

INFO 523 - Final Project

Project description

Author: Cesar Castro

Affiliation: College of Information Science, University of Arizona

Project Description:

Uncovering Patterns and Anomalies in Manufacturing Data

🎯Goals:

Modern factories generate vast amounts of data. Manufacturing equipment continuously monitors parameters such as temperature, vibration, motor speed, and energy consumption through sensors and other instrumentation. Variations in these parameters can indicate shifts in performance that may lead to product defects or catastrophic equipment failures. Detecting these shifts early has become increasingly important for reducing downtime and boosting productivity.

Advanced techniques such as machine learning, anomaly detection, and image analysis are now used to forecast when equipment might require maintenance, calibration, or material changes. This project uses public synthetic data from Kaggle to compare several classification and regression models for predicting these critical events. If time allows, we will also explore anomaly detection techniques on time series data to flag potential failures as early as possible.

📊Proposed Datasets:

Source: Kaggle - Machine Predictive Maintenance Classification (a synthetic dataset that reflects real predictive maintenance data encountered in industry)

Data Example:

	UDI	Product ID	Type	Air temperature [K]	Process temperature [K]	Rotational speed [rpm]	Torque [Nm]	Tool wear [min]	Failure Type
0	1	M14860	M	298.1	308.6	1551	42.8	0	No Failure
1	2	L47181	L	298.2	308.7	1408	46.3	3	No Failure
2	3	L47182	L	298.1	308.5	1498	49.4	5	No Failure
3	4	L47183	L	298.2	308.6	1433	39.5	7	No Failure
4	5	L47184	L	298.2	308.7	1408	40.0	9	No Failure

Source: Kaggle - Intelligent Manufacturing Dataset (the Intelligent Manufacturing Dataset for Predictive Optimization, designed for research in smart manufacturing, AI-driven process optimization, and predictive maintenance)

Data Example:

	Timestamp	Machine_ID	Operation_Mode	Temperature_C	Vibration_Hz	Power_Consumption_kW	Network_Latency_ms	Packet_Loss_%	Quality_Control_Defect_Rate_%	Production_Speed_units_per_hr	Predictive_Maintenance_Score	Error_Rate_%	Efficiency_Status
0	2024-01-01 00:00:00	39	Idle	74.137590	3.500595	8.612162	10.650542	0.207764	7.751261	477.657391	0.344650	14.965470	Low
1	2024-01-01 00:01:00	29	Active	84.264558	3.355928	2.268559	29.111810	2.228464	4.989172	398.174747	0.769848	7.678270	Low
2	2024-01-01 00:02:00	15	Active	44.280102	2.079766	6.144105	18.357292	1.639416	0.456816	108.074959	0.987086	8.198391	Low
3	2024-01-01 00:03:00	43	Active	40.568502	0.298238	4.067825	29.153629	1.161021	4.582974	329.579410	0.983390	2.740847	Medium
4	2024-01-01 00:04:00	8	Idle	75.063817	0.345810	6.225737	34.029191	4.796520	2.287716	159.113525	0.573117	12.100686	Low

🗓️Project Schedule

Week 1.

Definition of problem statement and goals
Plan to incorporate peer review feedback into project plan
Data cleaning (handle missing values and outliers, define imputation methods)
Define the key response for each dataset, depending on the model type: for example, a pass/fail defect label for a classification model, or the defect rate for a regression model
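
As a rough illustration of the Week 1 cleaning steps, the sketch below imputes a missing value, flags outliers, and derives a binary response. The column names follow the Machine Predictive Maintenance sample shown earlier, but the rows are made up, and the helper names (`impute_median`, `flag_iqr_outliers`), the median imputation, and the 1.5 x IQR rule are assumptions chosen for illustration rather than decisions already made for this project.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame. Column names match the sample above;
# the values (including the NaN and the 400.0 spike) are invented.
df = pd.DataFrame({
    "Torque [Nm]": [42.8, 46.3, np.nan, 39.5, 40.0, 400.0],
    "Tool wear [min]": [0, 3, 5, 7, 9, 11],
    "Failure Type": ["No Failure", "No Failure", "No Failure",
                     "No Failure", "No Failure", "Power Failure"],
})


def impute_median(frame, columns):
    """Return a copy with NaNs in `columns` replaced by each column's median."""
    out = frame.copy()
    for col in columns:
        out[col] = out[col].fillna(out[col].median())
    return out


def flag_iqr_outliers(series, k=1.5):
    """Boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


clean = impute_median(df, ["Torque [Nm]"])
clean["torque_outlier"] = flag_iqr_outliers(clean["Torque [Nm]"])

# Binary classification response: 1 for any failure type, 0 otherwise.
clean["failed"] = (clean["Failure Type"] != "No Failure").astype(int)

print(clean)
```

Flagging outliers instead of dropping them keeps the decision reversible, which may matter here because extreme readings can be exactly the failures the models are meant to predict.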

Week 2.

Analyze features (use PCA or other methods to understand which features contribute most to the variability)
Classification Model Creation and Validation
Regression Model Creation and Validation
Compare models and recommend the best one
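
One possible shape for the Week 2 workflow is sketched below with scikit-learn. It uses synthetic data from `make_classification` as a stand-in for the Kaggle files, and the two candidate models, the F1 metric, and the 5-fold cross-validation are assumptions for illustration, not the final model list.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for a failure / no-failure dataset.
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=1,
    weights=[0.9, 0.1], random_state=0,
)

# Feature analysis: share of variance captured by each principal component.
pca_pipeline = make_pipeline(StandardScaler(), PCA()).fit(X)
explained = pca_pipeline.named_steps["pca"].explained_variance_ratio_
print("Explained variance ratio:", explained.round(3))

# Candidate classifiers, compared with the same folds and the same metric.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in candidates.items()
}

best = max(scores, key=scores.get)
print(scores, "-> best:", best)
```

The same pattern should carry over to the regression comparison by swapping in regressors and a regression metric such as `scoring="r2"`. F1 is used here instead of accuracy because failures are rare, so a model that always predicts "No Failure" would otherwise look deceptively good.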

Week 3.

Assess anomaly detection and recommend possible next steps
Prepare final report and presentation
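
For the anomaly detection assessment, a simple baseline could be a rolling z-score over a sensor series. The sketch below is a minimal version under stated assumptions: the series is simulated (only the `Temperature_C` name and the one-minute sampling mirror the Intelligent Manufacturing sample), the spike is injected by hand, and the function name, window of 30, and threshold of 4 are arbitrary illustrative choices.

```python
import numpy as np
import pandas as pd

# Simulated one-minute temperature readings with one injected spike.
rng = np.random.default_rng(0)
ts = pd.Series(
    50 + rng.normal(0, 1, 200),
    index=pd.date_range("2024-01-01", periods=200, freq="min"),
    name="Temperature_C",
)
ts.iloc[150] += 15  # artificial anomaly for demonstration


def rolling_zscore_flags(series, window=30, threshold=4.0):
    """Flag points that deviate strongly from the preceding window.

    Mean and standard deviation are shifted by one step so that each
    point is compared only against earlier readings, not against itself.
    """
    mean = series.rolling(window).mean().shift(1)
    std = series.rolling(window).std().shift(1)
    z = (series - mean) / std
    return z.abs() > threshold


flags = rolling_zscore_flags(ts)
print(ts[flags])
```

The first `window` points are never flagged because there is not enough history to score them. A baseline like this can serve as a reference point before trying heavier methods.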

📁Project Organization

| FINAL-PROJECT-CASTRO
| --- 📁DATA: # Raw data files obtained from Kaggle in CSV format
| ______|---- 📁processed: # Cleaned and processed datasets
| ______|---- 📁results: # Model evaluation and other results
| --- 📁IMAGES: # Images used by the Quarto site
| --- 📁presentation_files: # Quarto presentation files
| --- 📁extra: # Additional documents or files used in the project
| --- 📁quarto: # Quarto files
| --- 📁src: # Source code used for the project
| --- 📁.github: # GitHub configuration files
| -- 📄requirements.txt: # Python dependencies
| -- 📄_quarto.yml: # Quarto metadata and configuration
| -- 📄.gitignore: # List of files and directories to be ignored by Git
| -- 📄about.qmd: # Quarto about page with general information about the project
| -- 📄presentation.qmd: # Quarto final project presentation
| -- 📄proposal.qmd: # Project problem statement and proposal
| -- 📄README.md: # Main README file for Git