Kev-HL committed on
Commit babf969 · 0 Parent(s)

Minimal clone for deployment, see README for full project
.dockerignore ADDED
@@ -0,0 +1,52 @@
+ # Dockerignore file for excluding unnecessary files from the Docker build context
+ # Byte-compiled / cache
+ # Python cache folders
+ __pycache__/
+ # Compiled Python files (.pyc, .pyo, etc.)
+ *.py[cod]
+ # Temporary files from Jupyter notebooks
+ *.ipynb_checkpoints/
+
+ # Environment
+ # Environment variables files
+ .env
+ # Catch other .env-like files
+ *.env
+
+ # OS metadata files
+ .DS_Store
+ Thumbs.db
+
+ # VSCode
+ # Editor-specific settings
+ .vscode/
+
+ # Git
+ .git/
+ .gitignore
+
+ # Logs, temp files
+ *.log
+ logs/
+ core
+
+ # Documentation
+ Project_Strategy.md
+ # Don't add config files
+ /configs/
+ # Don't add datasets
+ /data/
+ # Don't add Python notebooks
+ /notebooks/
+ # Don't add scripts
+ /scripts/
+ # Don't add modules
+ /src/
+ # Don't add trained model files
+ /models/
+ # Re-include the final model file (exception to the /models/ rule above)
+ !/models/final_model/final_model.tflite
+
+ # Don't add main repo requirements.txt file
+ # The requirements.txt for the Docker image is in app/requirements.txt
+ /requirements.txt
.gitattributes ADDED
@@ -0,0 +1,7 @@
+ # Files required to be stored with Git LFS on Hugging Face Spaces
+ # (all files >10MB, or binary files such as images, fonts...)
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.ttf filter=lfs diff=lfs merge=lfs -text
+ *.whl filter=lfs diff=lfs merge=lfs -text
+
.gitignore ADDED
@@ -0,0 +1,19 @@
+ # Byte-compiled / cache
+ # Python cache folders
+ __pycache__/
+ # Compiled Python files (.pyc, .pyo, etc.)
+ *.py[cod]
+ # Temporary files from Jupyter notebooks
+ *.ipynb_checkpoints/
+ # Other temp files
+ core
+
+ # VSCode
+ # Editor-specific settings
+ .vscode/
+
+ # Environment
+ # Environment variables files
+ .env
+ # Catch other .env-like files
+ *.env
Dockerfile ADDED
@@ -0,0 +1,72 @@
+ # Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI
+ # Use slim Python image for smaller size
+ FROM python:3.9.23-slim-bookworm
+
+ # Basic ownership labels
+ LABEL maintainer="Kev-HL (GitHub)"
+ LABEL org.opencontainers.image.source="https://github.com/Kev-HL/capsule-defect-segmentation-api"
+
+ # Set working directory
+ WORKDIR /app
+
+ # Create a non-root user and group (appuser)
+ RUN addgroup --system appuser && adduser --system --ingroup appuser appuser
+
+ # Update system packages and clean up
+ RUN apt-get update && apt-get upgrade -y && apt-get clean && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements.txt for API dependencies
+ COPY app/requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Install TensorFlow Lite runtime from local wheel file
+ # Remove or comment if using a different interpreter (tflite-runtime or ai-edge-litert)
+ COPY --chown=appuser:appuser app/tflite_runtime-2.19.0-cp39-cp39-linux_x86_64.whl .
+ RUN pip install --no-cache-dir ./tflite_runtime-2.19.0-cp39-cp39-linux_x86_64.whl
+
+ # Clean up
+ RUN rm tflite_runtime-2.19.0-cp39-cp39-linux_x86_64.whl && \
+     find /usr/local/lib/python3.9/ -type d -name '__pycache__' -prune -exec rm -rf {} + && \
+     rm -rf /usr/share/doc /usr/share/man /usr/share/info /usr/share/locale/*
+
+ # Copy app code (FastAPI app)
+ COPY --chown=appuser:appuser app/main.py .
+
+ # Copy aux code (functions for FastAPI app)
+ COPY --chown=appuser:appuser app/aux.py .
+
+ # Copy model file
+ COPY --chown=appuser:appuser models/final_model/final_model.tflite .
+
+ # Copy HTML templates
+ RUN mkdir -p templates && chown -R appuser:appuser templates
+ COPY --chown=appuser:appuser app/templates/ templates/
+
+ # Create static directories for uploads, results and samples
+ RUN mkdir -p static/uploads static/results static/samples && chown -R appuser:appuser static
+
+ # Copy sample images
+ COPY --chown=appuser:appuser app/samples/ static/samples/
+
+ # Copy font file (and license) for text rendering on images
+ RUN mkdir -p fonts && chown -R appuser:appuser fonts
+ COPY --chown=appuser:appuser app/fonts/OpenSans-Bold.ttf fonts/
+ COPY --chown=appuser:appuser app/fonts/OFL.txt fonts/
+
+ # Set permissions for static files
+ RUN chmod -R 777 static
+
+ # Switch to non-root user
+ USER appuser
+
+ # Expose port (FastAPI default)
+ EXPOSE 8000
+
+ # Set environment variables
+ # Disable buffering for easier logging (immediate output)
+ ENV PYTHONUNBUFFERED=1
+
+ # Start FastAPI app with uvicorn
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
README.md ADDED
@@ -0,0 +1,111 @@
+ > **Note:** This repo contains only deployment/demo files.
+ > For full source, notebooks, and complete code, see [Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI](https://github.com/Kev-HL/capsule-defect-segmentation-api).
+
+ # Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI
+
+ This project addresses a real-world computer vision challenge: detecting and localizing defects on medicinal capsules via image classification and segmentation.
+ The aim is to deliver a complete pipeline, from data preprocessing through model training and evaluation to deployment, demonstrating practical ML engineering from scratch to API.
+
+ ---
+
+ ## Main Repo
+
+ This is a minimal clone with only the necessary files from the main repo.
+ For full source, notebooks, and complete code, see [Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI](https://github.com/Kev-HL/capsule-defect-segmentation-api).
+
+ ---
+
+ ## Project Overview
+
+ End-to-end defect detection and localization using the **Capsule** class from the **MVTec AD dataset**.
+ Key steps include:
+ - Data preprocessing, formatting, and augmentation
+ - Model design (pre-trained backbone + custom heads)
+ - Training, evaluation, and hyperparameter tuning
+ - Dockerized FastAPI deployment for inference
+
+ *Portfolio project to showcase ML workflow and engineering.*
+
+ ---
+
+ ## Key Results
+
+ - Evaluation dataset: MVTec AD 'capsule' class, 70/15/15 train/val/test split
+ - Quantitative results on the test split:
+   - Classification accuracy: **83 %**
+   - Classification defect-only accuracy: **75 %**
+   - Defect presence accuracy: **91 %**
+   - Segmentation quality (mIoU / Dice): **0.79 / 0.73**
+   - Segmentation defect-only quality (mIoU / Dice): **0.70 / 0.55**
+ - Model artifacts:
+   - Original model size (.keras / SavedModel): **345 MB**
+   - Raw converted TFLite size (.tflite): **119 MB**
+   - Optimized TFLite size (.tflite): **31 MB** (dynamic range quantization applied; see the conversion sketch after this list)
+ - Container / runtime:
+   - Docker image size: **317 MB**
+   - Runtime used: **tflite-runtime + Uvicorn/FastAPI**
+   - Avg inference latency (inference only, set tensor + invoke): **239 ms**
+   - Avg inference latency (single POST request, measured): **271 ms**
+   - Average memory usage during inference: **321 MB**
+   - Startup time (local): **72 ms**
+ - Observations:
+   - The app returns the expected visualizations and class labels for MVTec-style test images.
+   - POST inference latency was measured locally; expect higher latency in real use due to network delays.
+   - Given the small and highly imbalanced dataset (351 samples: 242 'good' and 109 defective, spread across 5 defect types with ~22 samples each), and the nature of the samples (the only distinctive feature is the defect, which is usually small and varies in shape), performance is not as strong as desired and the results lack statistical confidence for real-world use. Without more data, a meaningful improvement would be difficult.
+
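+ The optimized TFLite artifact above was produced with dynamic range quantization. As a point of reference, the sketch below shows how such a conversion is typically done with the TensorFlow converter; the SavedModel path is illustrative, and the actual conversion script lives in the main repo.
+
+ ```python
+ import tensorflow as tf
+
+ # Load the trained model (illustrative path; see the main repo for the real conversion script)
+ converter = tf.lite.TFLiteConverter.from_saved_model('models/final_model')
+
+ # Dynamic range quantization: weights stored as 8-bit integers, activations kept in float
+ converter.optimizations = [tf.lite.Optimize.DEFAULT]
+
+ tflite_model = converter.convert()
+ with open('models/final_model/final_model.tflite', 'wb') as f:
+     f.write(tflite_model)
+ ```
+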
+ ---
+
+ ## Dataset
+
+ - *Capsule* class from [MVTec AD dataset](https://www.mvtec.com/company/research/datasets/mvtec-ad)
+ - License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
+ - Dataset folder contains license file
+ - Usage is strictly non-commercial/educational
+
+ ---
+
+ ## Tech Stack
+
+ - Python
+ - TensorFlow
+ - scikit-learn
+ - NumPy / Pandas
+ - OpenCV / Pillow
+ - Ray Tune (Hyperparameter tuning)
+ - OmegaConf (Config management)
+ - Docker, FastAPI, Uvicorn (Deployment)
+
+ ---
+
+ ## Folder Structure
+
+ ```
+ data/    # Dataset and annotations
+ app/     # Inference and deployment code and files
+ models/  # Saved trained models and training logs
+ ```
+
+ ---
+
+ ## How to Run
+
+ **Build image for deployment:**
+ - Requirements:
+   - `models/final_model/final_model.tflite` (included)
+   - `app/` folder and contents (included)
+   - `Dockerfile` (included)
+   - `.dockerignore` (included)
+ - From the project root, build and run the Docker image:
+   ```sh
+   docker build -t cv-app .
+   docker run -p 8000:8000 cv-app
+   ```
+ - Open http://localhost:8000 in your browser to access the demo UI
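+
+ **Query the API directly (optional):**
+ - The image also exposes a JSON endpoint, `POST /predict/` (see `app/main.py`), which accepts a PNG upload (max 5 MB) and returns the class ID, class name, and a base64-encoded mask.
+ - A minimal client sketch is shown below; it assumes the container is running locally on port 8000, that a local PNG named `capsule.png` exists (the name is illustrative), and that the `requests` package is installed on the client (it is not part of the image requirements):
+   ```python
+   import base64
+   import io
+
+   import requests
+   from PIL import Image
+
+   # Send a PNG to the /predict/ endpoint; only image/png uploads up to 5 MB are accepted
+   with open('capsule.png', 'rb') as f:
+       resp = requests.post(
+           'http://localhost:8000/predict/',
+           files={'file': ('capsule.png', f, 'image/png')},
+       )
+   resp.raise_for_status()
+   payload = resp.json()
+
+   # class_id / class_name follow CLASS_MAP in app/aux.py
+   print(payload['class_id'], payload['class_name'])
+
+   # The mask comes back as a base64-encoded grayscale PNG probability map
+   mask = Image.open(io.BytesIO(base64.b64decode(payload['mask64_PNG_L'])))
+   mask.save('predicted_mask.png')
+   ```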
104
+
105
+ _Note: For the full source code and steps on how to recreate the model, visit the full repo (see "Main Repo" section near the top)_
106
+
107
+ ---
108
+
109
+ ## Contact
110
+
111
+ For questions reach out via GitHub (Kev-HL).
app/aux.py ADDED
@@ -0,0 +1,215 @@
1
+ """
2
+ Module with auxiliary functions for the FastAPI defect detection demo app.
3
+
4
+ Provided functions:
5
+ - preprocess_image: Preprocess input image for model inference.
6
+ - inference: Perform model inference and return class ID, class name, and segmentation mask.
7
+ - encode_mask_to_base64: Encode segmentation mask to base64 string.
8
+ - save_image: Save uploaded image bytes to a file.
9
+ - draw_prediction: Save image with overlayed prediction mask and class name.
10
+
11
+ Design notes
12
+
13
+ - This application is a demonstration / portfolio app. For simplicity and safety during demo runs, inference is performed synchronously using a single global TFLite Interpreter instance protected by a threading.Lock to ensure thread-safety.
14
+ - The code intentionally makes a number of fixed assumptions about the model and runtime. If the model or deployment requirements change, the corresponding preprocessing, postprocessing and runtime setup should be updated and tested.
15
+
16
+ Assumptions:
17
+
18
+ File system and assets
19
+ Font used for drawing labels: ./fonts/OpenSans-Bold.ttf
20
+ Static files served from: ./static
21
+ Directories (./static/uploads, ./static/results, ./static/samples) are expected to be present/created by the deployment (Dockerfile or startup); added an exist_ok mkdir as safeguard.
22
+
23
+ Upload / input constraints
24
+ Uploaded images are expected to be valid PNG images (this matches the local MVTec AD dataset used for development).
25
+ Maximum accepted upload size: 5 MB.
26
+
27
+ Runtime / model
28
+ Uses tflite-runtime Interpreter for model inference (Interpreter from tflite_runtime.interpreter).
29
+ TFLite model file path: ./final_model.tflite
30
+ Single Interpreter instance is created at startup and reused for all requests (protected by a threading.Lock).
31
+
32
+ Model I/O (these are the exact assumptions used by the code)
33
+ Expected input tensor: shape (1, 512, 512, 3), dtype float32, pixel value range [0, 255] (model handles internally normalization to [0, 1]).
34
+ Expected output[0]: segmentation mask of shape (1, 512, 512, 1), dtype float32, values in [0, 1] (probability map).
35
+ Expected output[1]: class probabilities of shape (1, 6), dtype float32 (softmax-like probabilities).
36
+ """
37
+
38
+ # IMPORTS
39
+
40
+ # Standard library imports
41
+ import base64
42
+ import io
43
+ import logging
44
+ import os
45
+ import threading
46
+ import time
47
+ import uuid
48
+
49
+ # Third-party imports
50
+ import numpy as np
51
+ from PIL import Image, ImageDraw, ImageFont
52
+
53
+ # CONFIGURATION AND CONSTANTS
54
+
55
+ # Font path for drawing text on images
56
+ FONT_PATH = './fonts/OpenSans-Bold.ttf'
57
+
58
+ # Input image size for the model
59
+ INPUT_IMAGE_SIZE = (512, 512)
60
+
61
+ # Transparency level and color for mask overlay
62
+ MAX_ALPHA = 100 # [0-255]
63
+ MASK_COLOR = (0, 255, 255, 0) # Cyan RGB (R,G,B,A)
64
+
65
+ # Dictionary mapping class IDs to class names
66
+ CLASS_MAP = {
67
+ 0: 'good',
68
+ 1: 'crack',
69
+ 2: 'faulty_imprint',
70
+ 3: 'poke',
71
+ 4: 'scratch',
72
+ 5: 'squeeze'
73
+ }
74
+
75
+ # AUXILIARY FUNCTIONS FOR main.py
76
+
77
+ # Function to preprocess the image
78
+ def preprocess_image(image_bytes) -> np.ndarray:
79
+ """
80
+ Preprocess the input image for model inference.
81
+ Args:
82
+ image_bytes: Raw bytes of the input image.
83
+ Returns:
84
+ Preprocessed image as a numpy array of shape (1, INPUT_IMAGE_SIZE[0], INPUT_IMAGE_SIZE[1], 3) and dtype float32.
85
+ """
86
+ image = Image.open(io.BytesIO(image_bytes)).convert('RGB')
87
+ image = image.resize(INPUT_IMAGE_SIZE)
88
+ img_array = np.array(image, dtype=np.float32)
89
+ img_array = np.expand_dims(img_array, axis=0)
90
+ return img_array
91
+
92
+ # Function to perform inference on a preprocessed image
93
+ def inference(img, inference_ctx) -> tuple[int, str, np.ndarray]:
94
+ """
95
+ Perform model inference on the preprocessed image.
96
+ Args:
97
+ img: Preprocessed image as a numpy array.
98
+ inference_ctx: Dictionary containing the threading lock and the interpreter and its details.
99
+ Returns:
100
+ Tuple containing:
101
+ - class_id: Predicted class ID (int).
102
+ - class_name: Predicted class name (str).
103
+ - mask: Predicted segmentation mask as a numpy array.
104
+ """
105
+ # Ensure the interpreter is thread-safe
106
+ with inference_ctx['interpreter_lock']:
107
+
108
+ # Set the input tensor and invoke the interpreter
109
+ inference_ctx['interpreter'].set_tensor(inference_ctx['input_details'][0]['index'], img)
110
+ inference_ctx['interpreter'].invoke()
111
+ # Get the prediction results
112
+ pred_mask = inference_ctx['interpreter'].get_tensor(inference_ctx['output_details'][0]['index'])
113
+ pred_label_probs = inference_ctx['interpreter'].get_tensor(inference_ctx['output_details'][1]['index'])
114
+
115
+ # Format the prediction results and get the class name
116
+ pred_label = np.argmax(pred_label_probs, axis=1)
117
+ class_id = int(pred_label[0])
118
+ class_name = CLASS_MAP.get(class_id, 'unknown')
119
+ mask = pred_mask.squeeze()
120
+
121
+ return class_id, class_name, mask
122
+
123
+ # Function to encode mask to base64
124
+ def encode_mask_to_base64(mask_array) -> str:
125
+ """
126
+ Encode the segmentation mask to a base64 string.
127
+ Args:
128
+ mask_array: Segmentation mask as a numpy array.
129
+ Returns:
130
+ Base64-encoded string of the mask image.
131
+ """
132
+ mask = (mask_array * 255).astype(np.uint8)
133
+ mask_img = Image.fromarray(mask, mode='L')
134
+ buffer = io.BytesIO()
135
+ mask_img.save(buffer, format='PNG')
136
+ mask64 = base64.b64encode(buffer.getvalue()).decode('utf-8')
137
+ return mask64
138
+
139
+ # Function to save an image for later use
140
+ def save_image(image_bytes) -> tuple[str, str]:
141
+ """
142
+ Save the uploaded image bytes to a file.
143
+ Args:
144
+ image_bytes: Raw bytes of the input image.
145
+ Returns:
146
+ Tuple containing:
147
+ - filename: Name of the saved file (str).
148
+ - path: Path to the saved file (str).
149
+ """
150
+ filename = f'{uuid.uuid4().hex}.png'
151
+ os.makedirs('static/uploads', exist_ok=True)
152
+ path = f'static/uploads/{filename}'
153
+ with open(path, 'wb') as f:
154
+ f.write(image_bytes)
155
+ return filename, path
156
+
157
+ # Function to save image with overlayed prediction mask and class name
158
+ def draw_prediction(image_path, mask_array, class_name) -> tuple[str, str]:
159
+ """
160
+ Save image with overlayed prediction mask and class name.
161
+ Args:
162
+ image_path: Path to the original image file.
163
+ mask_array: Segmentation mask as a numpy array.
164
+ class_name: Predicted class name (str).
165
+ Returns:
166
+ Tuple containing:
167
+ - filename: Name of the saved file (str).
168
+ - path: Path to the saved file (str).
169
+ """
170
+ # Load the original image and mask
171
+ orig_img = Image.open(image_path).convert('RGB')
172
+ mask = (mask_array * 255).astype(np.uint8)
173
+ mask_img = Image.fromarray(mask, mode='L')
174
+ if mask_img.size != orig_img.size:
175
+ mask_img = mask_img.resize(orig_img.size, resample=Image.Resampling.BILINEAR)
176
+
177
+ # Overlay the mask on the original image with some transparency
178
+ alpha_arr = (np.array(mask_img, dtype=np.float32) / 255.0 * float(MAX_ALPHA)).astype(np.uint8)
179
+ alpha_img = Image.fromarray(alpha_arr, mode='L')
180
+ overlay = Image.new('RGBA', orig_img.size, MASK_COLOR)
181
+ overlay.putalpha(alpha_img)
182
+ overlay_img = Image.alpha_composite(orig_img.convert('RGBA'), overlay).convert('RGB')
183
+
184
+ # Draw the class name on the image
185
+ draw = ImageDraw.Draw(overlay_img)
186
+ try:
187
+ font = ImageFont.truetype(FONT_PATH, 35)
188
+ except OSError:
189
+ font = ImageFont.load_default()
190
+ draw.text((40, 40), class_name, fill='red', font=font)
191
+
192
+ # Save the visualization image (mask overlay and class label)
193
+ filename = f'{uuid.uuid4().hex}.png'
194
+ os.makedirs('static/results', exist_ok=True)
195
+ path = f'static/results/{filename}'
196
+ overlay_img.save(path)
197
+
198
+ return filename, path
199
+
200
+ # Function to delete files after a delay
201
+ def delete_files_later(files, delay=10) -> None:
202
+ """
203
+ Delete files after a specified delay.
204
+ Args:
205
+ files: List of file paths to delete.
206
+ delay: Time in seconds to wait before deleting files (default is 10).
207
+ """
208
+ def _del_files():
209
+ time.sleep(delay)
210
+ for f in files:
211
+ try: os.remove(f)
212
+ except OSError: logging.exception('Error deleting file %s', f)
213
+ t = threading.Thread(target=_del_files, daemon=True)
214
+ t.start()
215
+
app/fonts/OFL.txt ADDED
@@ -0,0 +1,88 @@
1
+ Copyright 2020 The Open Sans Project Authors (https://github.com/googlefonts/opensans)
2
+
3
+ -----------------------------------------------------------
4
+ SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
5
+ -----------------------------------------------------------
6
+
7
+ PREAMBLE
8
+ The goals of the Open Font License (OFL) are to stimulate worldwide
9
+ development of collaborative font projects, to support the font
10
+ creation efforts of academic and linguistic communities, and to
11
+ provide a free and open framework in which fonts may be shared and
12
+ improved in partnership with others.
13
+
14
+ The OFL allows the licensed fonts to be used, studied, modified and
15
+ redistributed freely as long as they are not sold by themselves. The
16
+ fonts, including any derivative works, can be bundled, embedded,
17
+ redistributed and/or sold with any software provided that any reserved
18
+ names are not used by derivative works. The fonts and derivatives,
19
+ however, cannot be released under any other type of license. The
20
+ requirement for fonts to remain under this license does not apply to
21
+ any document created using the fonts or their derivatives.
22
+
23
+ DEFINITIONS
24
+ "Font Software" refers to the set of files released by the Copyright
25
+ Holder(s) under this license and clearly marked as such. This may
26
+ include source files, build scripts and documentation.
27
+
28
+ "Reserved Font Name" refers to any names specified as such after the
29
+ copyright statement(s).
30
+
31
+ "Original Version" refers to the collection of Font Software
32
+ components as distributed by the Copyright Holder(s).
33
+
34
+ "Modified Version" refers to any derivative made by adding to,
35
+ deleting, or substituting -- in part or in whole -- any of the
36
+ components of the Original Version, by changing formats or by porting
37
+ the Font Software to a new environment.
38
+
39
+ "Author" refers to any designer, engineer, programmer, technical
40
+ writer or other person who contributed to the Font Software.
41
+
42
+ PERMISSION & CONDITIONS
43
+ Permission is hereby granted, free of charge, to any person obtaining
44
+ a copy of the Font Software, to use, study, copy, merge, embed,
45
+ modify, redistribute, and sell modified and unmodified copies of the
46
+ Font Software, subject to the following conditions:
47
+
48
+ 1) Neither the Font Software nor any of its individual components, in
49
+ Original or Modified Versions, may be sold by itself.
50
+
51
+ 2) Original or Modified Versions of the Font Software may be bundled,
52
+ redistributed and/or sold with any software, provided that each copy
53
+ contains the above copyright notice and this license. These can be
54
+ included either as stand-alone text files, human-readable headers or
55
+ in the appropriate machine-readable metadata fields within text or
56
+ binary files as long as those fields can be easily viewed by the user.
57
+
58
+ 3) No Modified Version of the Font Software may use the Reserved Font
59
+ Name(s) unless explicit written permission is granted by the
60
+ corresponding Copyright Holder. This restriction only applies to the
61
+ primary font name as presented to the users.
62
+
63
+ 4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
64
+ Software shall not be used to promote, endorse or advertise any
65
+ Modified Version, except to acknowledge the contribution(s) of the
66
+ Copyright Holder(s) and the Author(s) or with their explicit written
67
+ permission.
68
+
69
+ 5) The Font Software, modified or unmodified, in part or in whole,
70
+ must be distributed entirely under this license, and must not be
71
+ distributed under any other license. The requirement for fonts to
72
+ remain under this license does not apply to any document created using
73
+ the Font Software.
74
+
75
+ TERMINATION
76
+ This license becomes null and void if any of the above conditions are
77
+ not met.
78
+
79
+ DISCLAIMER
80
+ THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
81
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
82
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
83
+ OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE
84
+ COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
85
+ INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
86
+ DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
87
+ FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
88
+ OTHER DEALINGS IN THE FONT SOFTWARE.
app/fonts/OpenSans-Bold.ttf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:27da758f4dcac9a65abe914c13b463b42982b9909bc65713424099f4810bd1e6
+ size 147264
app/main.py ADDED
@@ -0,0 +1,301 @@
1
+ """
2
+ FastAPI app for defect detection using a TFLite model.
3
+
4
+ Provided endpoints
5
+
6
+ - GET / : Render an HTML form (no results).
7
+ - POST /predict/ : REST API. Predict defect on an uploaded image; returns JSON.
8
+ - POST /upload/ : Upload image, run prediction, and return an HTML page with visualization and results.
9
+ - POST /random-sample/ : Run prediction on a random sample image and return an HTML page with visualization and results.
10
+
11
+ Design notes
12
+
13
+ - This application is a demonstration / portfolio app. For simplicity and safety during demo runs, inference is performed synchronously using a single global TFLite Interpreter instance protected by a threading.Lock to ensure thread-safety.
14
+ - The code intentionally makes a number of fixed assumptions about the model and runtime. If the model or deployment requirements change, the corresponding preprocessing, postprocessing and runtime setup should be updated and tested.
15
+
16
+ Assumptions:
17
+
18
+ File system and assets
19
+ Font used for drawing labels: ./fonts/OpenSans-Bold.ttf
20
+ Static files served from: ./static
21
+ Directories (./static/uploads, ./static/results, ./static/samples) are expected to be present/created by the deployment (Dockerfile or startup); added an exist_ok mkdir as safeguard.
22
+
23
+ Upload / input constraints
24
+ Uploaded images are expected to be valid PNG images (this matches the local MVTec AD dataset used for development).
25
+ Maximum accepted upload size: 5 MB.
26
+
27
+ Runtime / model
28
+ Uses tflite-runtime Interpreter for model inference (Interpreter from tflite_runtime.interpreter).
29
+ TFLite model file path: ./final_model.tflite
30
+ Single Interpreter instance is created at startup and reused for all requests (protected by a threading.Lock).
31
+
32
+ Model I/O (these are the exact assumptions used by the code)
33
+ Expected input tensor: shape (1, 512, 512, 3), dtype float32, pixel value range [0, 255] (model handles internally normalization to [0, 1]).
34
+ Expected output[0]: segmentation mask of shape (1, 512, 512, 1), dtype float32, values in [0, 1] (probability map).
35
+ Expected output[1]: class probabilities of shape (1, 6), dtype float32 (softmax-like probabilities).
36
+ """
37
+
38
+ # IMPORTS
39
+
40
+ # Standard library imports
41
+ import io
42
+ import logging
43
+ import os
44
+ import random
45
+ import time
46
+ import threading
47
+
48
+ # Third-party imports
49
+ from fastapi import FastAPI, File, UploadFile, Request, BackgroundTasks
50
+ from fastapi.responses import JSONResponse, HTMLResponse
51
+ from fastapi.templating import Jinja2Templates
52
+ from fastapi.staticfiles import StaticFiles
53
+ from PIL import Image, UnidentifiedImageError
54
+ from tflite_runtime.interpreter import Interpreter
55
+ # from ai_edge_litert.interpreter import Interpreter
56
+
57
+ # Auxiliary imports (Dockerfile sets CWD to /app)
58
+ from aux import preprocess_image, inference, save_image, draw_prediction, encode_mask_to_base64, delete_files_later
59
+
60
+ # START TIME LOGGING
62
+ app_start = time.perf_counter()
63
+
64
+ # CONFIGURATION AND CONSTANTS
65
+
66
+ # Path to TFLite model file
67
+ MODEL_PATH = './final_model.tflite'
68
+
69
+ # Number of threads for TFLite interpreter
70
+ NUM_THREADS = 4
71
+
72
+ # Jinja2 templates directory
73
+ TEMPLATES = Jinja2Templates(directory='templates')
74
+
75
+ # Max file size for uploads (5 MB)
76
+ MAX_FILE_SIZE = 5 * 1024 * 1024 # 5 MB
77
+
78
+
79
+ # MAIN APPLICATION
80
+
81
+
82
+ # Set up logging to show INFO level and above messages
83
+ logging.basicConfig(level=logging.INFO)
84
+
85
+ # Initialize FastAPI app
86
+ app = FastAPI()
87
+
88
+ # Mount static files directory for serving images and other assets
89
+ # App will raise errors if folders do not exist
90
+ # Directory creation is handled by the Dockerfile
91
+ os.makedirs('static', exist_ok=True)
92
+ app.mount('/static', StaticFiles(directory='static'), name='static')
93
+
94
+ # Load model, set up interpreter and get input/output details
95
+ try:
96
+ interpreter = Interpreter(model_path=MODEL_PATH, num_threads=NUM_THREADS)
97
+ except:
98
+ logging.warning(f'num_threads={NUM_THREADS} not supported, falling back to single-threaded interpreter.')
99
+ interpreter = Interpreter(model_path=MODEL_PATH)
100
+ interpreter.allocate_tensors()
101
+ input_details = interpreter.get_input_details()
102
+ output_details = interpreter.get_output_details()
103
+ logging.info('TF Lite input details: %s \n', input_details)
104
+ logging.info('TF Lite output details: %s \n', output_details)
105
+
106
+ # Create a threading lock for the interpreter to ensure thread-safety
107
+ interpreter_lock = threading.Lock()
108
+
109
+ # Inference context to be passed to inference function
110
+ inference_ctx = {
111
+ 'interpreter_lock': interpreter_lock,
112
+ 'interpreter': interpreter,
113
+ 'input_details': input_details,
114
+ 'output_details': output_details,
115
+ }
116
+
117
+ # Startup time measurement
118
+ @app.on_event('startup')
119
+ async def report_startup_time():
120
+ startup_time = (time.perf_counter() - app_start) * 1000 # in milliseconds
121
+ logging.info(f'App startup time: {startup_time:.2f} ms \n')
122
+
123
+ # Root endpoint to render the HTML form
124
+ @app.get('/', response_class=HTMLResponse)
125
+ async def root(request: Request):
126
+ # Render the HTML form with empty image URLs and no result
127
+ return TEMPLATES.TemplateResponse(
128
+ 'index.html',
129
+ {
130
+ 'request': request,
131
+ 'result': None,
132
+ 'orig_img_url': None,
133
+ 'vis_img_url': None,
134
+ }
135
+ )
136
+
137
+ # Endpoint to handle image prediction (API)
138
+ @app.post('/predict/')
139
+ async def predict(file: UploadFile = File(...)):
140
+ try:
141
+ # Check if the uploaded file is a PNG image
142
+ if file.content_type != 'image/png':
143
+ return JSONResponse(status_code=400, content={'error': 'Only PNG images are supported.'})
144
+
145
+ # Read the image
146
+ image_bytes = await file.read()
147
+
148
+ # Check if the file size exceeds the maximum limit
149
+ if len(image_bytes) > MAX_FILE_SIZE:
150
+ return JSONResponse(status_code=400, content={'error': 'File size exceeds the maximum limit of 5 MB.'})
151
+
152
+ # Check if the image is a valid PNG (not just a file with .png extension)
153
+ try:
154
+ img_check = Image.open(io.BytesIO(image_bytes))
155
+ if img_check.format != 'PNG':
156
+ raise ValueError('Not a PNG')
157
+ except (UnidentifiedImageError, ValueError):
158
+ return JSONResponse(status_code=400, content={'error': 'Invalid image file.'})
159
+
160
+ # Preprocess the image
161
+ img = preprocess_image(image_bytes)
162
+
163
+ # Run inference on the preprocessed image
164
+ class_id, class_name, mask = inference(img, inference_ctx)
165
+
166
+ # Encode mask to base64
167
+ mask64 = encode_mask_to_base64(mask)
168
+
169
+ # Return the prediction results as JSON
170
+ return {
171
+ 'class_id': class_id,
172
+ 'class_name': class_name,
173
+ 'mask64_PNG_L': mask64,
174
+ }
175
+ except Exception as e:
176
+ logging.exception(f'Error during prediction: {e}')
177
+ return JSONResponse(status_code=500, content={'error': 'Model inference failed.'})
178
+
179
+ # Endpoint to handle image upload and prediction with visualization
180
+ @app.post('/upload/', response_class=HTMLResponse)
181
+ async def upload(
182
+ request: Request,
183
+ file: UploadFile = File(...),
184
+ background_tasks: BackgroundTasks = None
185
+ ):
186
+ try:
187
+ # Check if the uploaded file is a PNG image
188
+ if file.content_type != 'image/png':
189
+ result = {'error': 'Only PNG images are supported.'}
190
+ return TEMPLATES.TemplateResponse('index.html', {'request': request, 'result': result})
191
+
192
+ # Read the uploaded image
193
+ image_bytes = await file.read()
194
+
195
+ # Check if the file size exceeds the maximum limit
196
+ if len(image_bytes) > MAX_FILE_SIZE:
197
+ return TEMPLATES.TemplateResponse('index.html', {'request': request, 'result': {'error': 'File too large (max 5MB).'}})
198
+
199
+ # Check if the image is a valid PNG (not just a file with .png extension)
200
+ try:
201
+ img_check = Image.open(io.BytesIO(image_bytes))
202
+ if img_check.format != 'PNG':
203
+ raise ValueError('Not a PNG')
204
+ except (UnidentifiedImageError, ValueError):
205
+ return TEMPLATES.TemplateResponse('index.html', {'request': request, 'result': {'error': 'Invalid image file.'}})
206
+
207
+ # Save the preprocessed image
208
+ preproc_filename, preproc_path = save_image(image_bytes)
209
+
210
+ # Preprocess the image
211
+ img = preprocess_image(image_bytes)
212
+
213
+ # Run inference on the preprocessed image
214
+ class_id, class_name, mask = inference(img, inference_ctx)
215
+
216
+ # Overlay mask and draw class name on the preprocessed image for display
217
+ pred_filename, pred_path = draw_prediction(preproc_path, mask, class_name)
218
+
219
+ # Encode mask to base64
220
+ mask64 = encode_mask_to_base64(mask)
221
+
222
+ # Prepare the result to be displayed in the HTML template
223
+ result = {
224
+ 'class_id': class_id,
225
+ 'class_name': class_name,
226
+ 'mask64_PNG_L': mask64,
227
+ }
228
+
229
+ # Schedule deletion of both images after 10 seconds
230
+ if background_tasks is not None:
231
+ background_tasks.add_task(delete_files_later, [preproc_path, pred_path], delay=10)
232
+
233
+ # Render the HTML template with the result and image URLs
234
+ return TEMPLATES.TemplateResponse(
235
+ 'index.html',
236
+ {
237
+ 'request': request,
238
+ 'result': result,
239
+ 'preproc_img_url': f'/static/uploads/{preproc_filename}',
240
+ 'pred_img_url': f'/static/results/{pred_filename}',
241
+ }
242
+ )
243
+ except Exception as e:
244
+ logging.exception(f'Error during prediction: {e}')
245
+ return TEMPLATES.TemplateResponse('index.html', {'request': request, 'result': {'error': 'Model inference failed.'}})
246
+
247
+ # Endpoint to handle random image (from samples) prediction with visualization
248
+ @app.post('/random-sample/', response_class=HTMLResponse)
249
+ async def random_sample(request: Request, background_tasks: BackgroundTasks = None):
250
+ try:
251
+ # Check if the samples directory exists and contains PNG files
252
+ samples_dir = 'static/samples'
253
+ sample_files = [f for f in os.listdir(samples_dir) if f.lower().endswith('.png')]
254
+ if not sample_files:
255
+ result = {'error': 'No sample images available.'}
256
+ return TEMPLATES.TemplateResponse('index.html', {'request': request, 'result': result})
257
+
258
+ # Randomly select a sample image and read it
259
+ chosen_file = random.choice(sample_files)
260
+ with open(os.path.join(samples_dir, chosen_file), 'rb') as f:
261
+ image_bytes = f.read()
262
+
263
+ # Save preprocessed image
264
+ preproc_filename, preproc_path = save_image(image_bytes)
265
+
266
+ # Preprocess the image
267
+ img = preprocess_image(image_bytes)
268
+
269
+ # Run inference on the preprocessed image
270
+ class_id, class_name, mask = inference(img, inference_ctx)
271
+
272
+ # Overlay mask and draw class name on the preprocessed image for display
273
+ pred_filename, pred_path = draw_prediction(preproc_path, mask, class_name)
274
+
275
+ # Encode mask to base64
276
+ mask64 = encode_mask_to_base64(mask)
277
+
278
+ # Prepare the result to be displayed in the HTML template
279
+ result = {
280
+ 'class_id': class_id,
281
+ 'class_name': class_name,
282
+ 'mask64_PNG_L': mask64,
283
+ }
284
+
285
+ # Schedule deletion of both images after 10 seconds
286
+ if background_tasks is not None:
287
+ background_tasks.add_task(delete_files_later, [preproc_path, pred_path], delay=10)
288
+
289
+ # Render the HTML template with the result and image URLs
290
+ return TEMPLATES.TemplateResponse(
291
+ 'index.html',
292
+ {
293
+ 'request': request,
294
+ 'result': result,
295
+ 'preproc_img_url': f'/static/uploads/{preproc_filename}',
296
+ 'pred_img_url': f'/static/results/{pred_filename}',
297
+ }
298
+ )
299
+ except Exception as e:
300
+ logging.exception(f'Error during prediction: {e}')
301
+ return TEMPLATES.TemplateResponse('index.html', {'request': request, 'result': {'error': 'Model inference failed.'}})
app/requirements.txt ADDED
@@ -0,0 +1,33 @@
+ # Requirements for the application
+ # Python version used in the container: 3.9.23
+ fastapi==0.121.3
+ jinja2==3.1.6
+ numpy==1.26.4
+ pillow==11.3.0
+ python-multipart==0.0.20
+ uvicorn==0.38.0
+
+ # Additional requirements for testing
+ # Uncomment if needed
+ # measure_inference.py requires psutil for memory sampling
+ # measure_http.py requires requests for HTTP requests
+ # psutil==7.1.3
+ # requests==2.32.5
+
+ # The interpreter for the TFLite model is installed in the Dockerfile (wheel manually compiled from the LiteRT repo)
+ # https://github.com/google-ai-edge/LiteRT
+ # Commit used for the provided wheel: cc245c70a9113041467a4add21be6d1553b8d831
+ # If replicating the environment without the provided wheel, install one of the following
+ # and remove/comment the interpreter installation lines in the Dockerfile:
+ #
+ # - full tensorflow (includes the TFLite interpreter up to TF 2.20)
+ #   USAGE: from tensorflow.lite.python.interpreter import Interpreter (for TF 2.20.0, other versions may differ)
+ # tensorflow==2.20.0
+ #
+ # - tflite-runtime if trained/converted with older TF versions; smaller package but not compatible with recent op versions (deprecated package)
+ #   USAGE: from tflite_runtime.interpreter import Interpreter
+ # tflite-runtime==2.14.0
+ #
+ # - ai-edge-litert for the latest TFLite interpreter with extended op support, but larger package size
+ #   USAGE: from ai_edge_litert.interpreter import Interpreter
+ # ai-edge-litert==2.0.3
app/samples/sample_0.png ADDED

Git LFS Details

  • SHA256: d63295f7073046911ab3d84d5d38ca12adbb0de33951a05d03578da9e0163904
  • Pointer size: 132 Bytes
  • Size of remote file: 1.17 MB
app/samples/sample_1.png ADDED

Git LFS Details

  • SHA256: ed0ebdbea5764d2031057aedbd4e0152cfbbc250d222fb40acf58d0c3f13be91
  • Pointer size: 132 Bytes
  • Size of remote file: 1.17 MB
app/samples/sample_10.png ADDED

Git LFS Details

  • SHA256: 28e839bfc1dcc9c8c811d0f3817fb7c1184147883f5fefbb7b6fef787e69dc9a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.16 MB
app/samples/sample_11.png ADDED

Git LFS Details

  • SHA256: 3e60541e78494f3a3d410859f742ea3aa3b9d050a86e1eba967c616d46b6643e
  • Pointer size: 132 Bytes
  • Size of remote file: 1.18 MB
app/samples/sample_2.png ADDED

Git LFS Details

  • SHA256: e43e63a51d649a3fb162ed6dfc4fffe7e4903c1eb647d0702f63dfebb413d819
  • Pointer size: 132 Bytes
  • Size of remote file: 1.15 MB
app/samples/sample_3.png ADDED

Git LFS Details

  • SHA256: 07685635e898f3ed4e7ffc8c67cac6f683e43c9988af19b14e4da4ae3c36fd55
  • Pointer size: 132 Bytes
  • Size of remote file: 1.15 MB
app/samples/sample_4.png ADDED

Git LFS Details

  • SHA256: 04ba8c639ea47eead9e9e19e439aff38eafb735efd49e35f6f84f05791d3cd02
  • Pointer size: 132 Bytes
  • Size of remote file: 1.17 MB
app/samples/sample_5.png ADDED

Git LFS Details

  • SHA256: a1ead780817351c8eab7c1a8e3bb8578bad97b2c94ad6b0eae126b0bd18e3cd7
  • Pointer size: 132 Bytes
  • Size of remote file: 1.18 MB
app/samples/sample_6.png ADDED

Git LFS Details

  • SHA256: 3ced8e481d7d7cc07f5f885dc7db4b5e10e3017ad193f33194f3856975d2f5b6
  • Pointer size: 132 Bytes
  • Size of remote file: 1.16 MB
app/samples/sample_7.png ADDED

Git LFS Details

  • SHA256: e39b8b40ca516ceb5ce03ed8b77fda4634ed5c98b29f3939769dd50edb3f66d3
  • Pointer size: 132 Bytes
  • Size of remote file: 1.14 MB
app/samples/sample_8.png ADDED

Git LFS Details

  • SHA256: f2affe78c2d06202819520c1d74a13cd4ff2467d5c6eaa52f073c2b0644f449f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.17 MB
app/samples/sample_9.png ADDED

Git LFS Details

  • SHA256: 468c60496275b35ba5e6531bce1ff4ffdf45b50eea5166333d96cedd27aff9c2
  • Pointer size: 132 Bytes
  • Size of remote file: 1.14 MB
app/templates/index.html ADDED
@@ -0,0 +1,51 @@
+ <!DOCTYPE html>
+ <html>
+ <head>
+ <title>[DEMO] ML API</title>
+ <style>
+ body { font-family: Arial, sans-serif; margin: 2em; }
+ form { margin-bottom: 1em; }
+ #result { white-space: pre-wrap; background: #f5f5f5; padding: 1em; border-radius: 5px; }
+ </style>
+ </head>
+ <body>
+ <h1>[DEMO] Defect Classification & Localization on MVTec AD Capsule Dataset</h1>
+
+ <!-- This form allows users to upload a PNG image for defect detection -->
+ <form action="/upload/" method="post" enctype="multipart/form-data">
+ <label for="file">Upload PNG image:</label>
+ <input type="file" id="file" name="file" accept="image/png" required>
+ <button type="submit">Predict</button>
+ </form>
+
+ <!-- This button allows users to try a random sample image using the following script -->
+ <button id="randomBtn">Try a random sample image</button>
+ <script>
+ document.getElementById("randomBtn").onclick = function() {
+ // Submit POST to /random-sample/ without any file
+ fetch("/random-sample/", {method: "POST"})
+ .then(response => response.text())
+ .then(html => {
+ document.open();
+ document.write(html);
+ document.close();
+ });
+ return false;
+ };
+ </script>
+
+ <!-- Display the result of the prediction -->
+ <!-- First display the prediction images and then the JSON result -->
+
+ {% if preproc_img_url and pred_img_url %}
+ <h3>Preprocessed image (as seen by the model):</h3>
+ <img src="{{ preproc_img_url }}" alt="preprocessed" style="max-width: 400px; border:1px solid #ccc;">
+ <h3>Prediction image:</h3>
+ <img src="{{ pred_img_url }}" alt="prediction" style="max-width: 400px; border:1px solid #ccc;">
+ {% endif %}
+ {% if result %}
+ <h2>Result</h2>
+ <div id="result">{{ result | tojson(indent=2) }}</div>
+ {% endif %}
+ </body>
+ </html>
app/tflite_runtime-2.19.0-cp39-cp39-linux_x86_64.whl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c6c1d62bd5838cad61d790185bbe25f377d034198719c46326be519f6676739c
+ size 3378306
data/capsule/license.txt ADDED
@@ -0,0 +1,438 @@
1
+ Attribution-NonCommercial-ShareAlike 4.0 International
2
+
3
+ =======================================================================
4
+
5
+ Creative Commons Corporation ("Creative Commons") is not a law firm and
6
+ does not provide legal services or legal advice. Distribution of
7
+ Creative Commons public licenses does not create a lawyer-client or
8
+ other relationship. Creative Commons makes its licenses and related
9
+ information available on an "as-is" basis. Creative Commons gives no
10
+ warranties regarding its licenses, any material licensed under their
11
+ terms and conditions, or any related information. Creative Commons
12
+ disclaims all liability for damages resulting from their use to the
13
+ fullest extent possible.
14
+
15
+ Using Creative Commons Public Licenses
16
+
17
+ Creative Commons public licenses provide a standard set of terms and
18
+ conditions that creators and other rights holders may use to share
19
+ original works of authorship and other material subject to copyright
20
+ and certain other rights specified in the public license below. The
21
+ following considerations are for informational purposes only, are not
22
+ exhaustive, and do not form part of our licenses.
23
+
24
+ Considerations for licensors: Our public licenses are
25
+ intended for use by those authorized to give the public
26
+ permission to use material in ways otherwise restricted by
27
+ copyright and certain other rights. Our licenses are
28
+ irrevocable. Licensors should read and understand the terms
29
+ and conditions of the license they choose before applying it.
30
+ Licensors should also secure all rights necessary before
31
+ applying our licenses so that the public can reuse the
32
+ material as expected. Licensors should clearly mark any
33
+ material not subject to the license. This includes other CC-
34
+ licensed material, or material used under an exception or
35
+ limitation to copyright. More considerations for licensors:
36
+ wiki.creativecommons.org/Considerations_for_licensors
37
+
38
+ Considerations for the public: By using one of our public
39
+ licenses, a licensor grants the public permission to use the
40
+ licensed material under specified terms and conditions. If
41
+ the licensor's permission is not necessary for any reason--for
42
+ example, because of any applicable exception or limitation to
43
+ copyright--then that use is not regulated by the license. Our
44
+ licenses grant only permissions under copyright and certain
45
+ other rights that a licensor has authority to grant. Use of
46
+ the licensed material may still be restricted for other
47
+ reasons, including because others have copyright or other
48
+ rights in the material. A licensor may make special requests,
49
+ such as asking that all changes be marked or described.
50
+ Although not required by our licenses, you are encouraged to
51
+ respect those requests where reasonable. More considerations
52
+ for the public:
53
+ wiki.creativecommons.org/Considerations_for_licensees
54
+
55
+ =======================================================================
56
+
57
+ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
58
+ Public License
59
+
60
+ By exercising the Licensed Rights (defined below), You accept and agree
61
+ to be bound by the terms and conditions of this Creative Commons
62
+ Attribution-NonCommercial-ShareAlike 4.0 International Public License
63
+ ("Public License"). To the extent this Public License may be
64
+ interpreted as a contract, You are granted the Licensed Rights in
65
+ consideration of Your acceptance of these terms and conditions, and the
66
+ Licensor grants You such rights in consideration of benefits the
67
+ Licensor receives from making the Licensed Material available under
68
+ these terms and conditions.
69
+
70
+
71
+ Section 1 -- Definitions.
72
+
73
+ a. Adapted Material means material subject to Copyright and Similar
74
+ Rights that is derived from or based upon the Licensed Material
75
+ and in which the Licensed Material is translated, altered,
76
+ arranged, transformed, or otherwise modified in a manner requiring
77
+ permission under the Copyright and Similar Rights held by the
78
+ Licensor. For purposes of this Public License, where the Licensed
79
+ Material is a musical work, performance, or sound recording,
80
+ Adapted Material is always produced where the Licensed Material is
81
+ synched in timed relation with a moving image.
82
+
83
+ b. Adapter's License means the license You apply to Your Copyright
84
+ and Similar Rights in Your contributions to Adapted Material in
85
+ accordance with the terms and conditions of this Public License.
86
+
87
+ c. BY-NC-SA Compatible License means a license listed at
88
+ creativecommons.org/compatiblelicenses, approved by Creative
89
+ Commons as essentially the equivalent of this Public License.
90
+
91
+ d. Copyright and Similar Rights means copyright and/or similar rights
92
+ closely related to copyright including, without limitation,
93
+ performance, broadcast, sound recording, and Sui Generis Database
94
+ Rights, without regard to how the rights are labeled or
95
+ categorized. For purposes of this Public License, the rights
96
+ specified in Section 2(b)(1)-(2) are not Copyright and Similar
97
+ Rights.
98
+
99
+ e. Effective Technological Measures means those measures that, in the
100
+ absence of proper authority, may not be circumvented under laws
101
+ fulfilling obligations under Article 11 of the WIPO Copyright
102
+ Treaty adopted on December 20, 1996, and/or similar international
103
+ agreements.
104
+
105
+ f. Exceptions and Limitations means fair use, fair dealing, and/or
106
+ any other exception or limitation to Copyright and Similar Rights
107
+ that applies to Your use of the Licensed Material.
108
+
109
+ g. License Elements means the license attributes listed in the name
110
+ of a Creative Commons Public License. The License Elements of this
111
+ Public License are Attribution, NonCommercial, and ShareAlike.
112
+
113
+ h. Licensed Material means the artistic or literary work, database,
114
+ or other material to which the Licensor applied this Public
115
+ License.
116
+
117
+ i. Licensed Rights means the rights granted to You subject to the
118
+ terms and conditions of this Public License, which are limited to
119
+ all Copyright and Similar Rights that apply to Your use of the
120
+ Licensed Material and that the Licensor has authority to license.
121
+
122
+ j. Licensor means the individual(s) or entity(ies) granting rights
123
+ under this Public License.
124
+
125
+ k. NonCommercial means not primarily intended for or directed towards
126
+ commercial advantage or monetary compensation. For purposes of
127
+ this Public License, the exchange of the Licensed Material for
128
+ other material subject to Copyright and Similar Rights by digital
129
+ file-sharing or similar means is NonCommercial provided there is
130
+ no payment of monetary compensation in connection with the
131
+ exchange.
132
+
133
+ l. Share means to provide material to the public by any means or
134
+ process that requires permission under the Licensed Rights, such
135
+ as reproduction, public display, public performance, distribution,
136
+ dissemination, communication, or importation, and to make material
137
+ available to the public including in ways that members of the
138
+ public may access the material from a place and at a time
139
+ individually chosen by them.
140
+
141
+ m. Sui Generis Database Rights means rights other than copyright
142
+ resulting from Directive 96/9/EC of the European Parliament and of
143
+ the Council of 11 March 1996 on the legal protection of databases,
144
+ as amended and/or succeeded, as well as other essentially
145
+ equivalent rights anywhere in the world.
146
+
147
+ n. You means the individual or entity exercising the Licensed Rights
148
+ under this Public License. Your has a corresponding meaning.
149
+
150
+
151
+ Section 2 -- Scope.
152
+
153
+ a. License grant.
154
+
155
+ 1. Subject to the terms and conditions of this Public License,
156
+ the Licensor hereby grants You a worldwide, royalty-free,
157
+ non-sublicensable, non-exclusive, irrevocable license to
158
+ exercise the Licensed Rights in the Licensed Material to:
159
+
160
+ a. reproduce and Share the Licensed Material, in whole or
161
+ in part, for NonCommercial purposes only; and
162
+
163
+ b. produce, reproduce, and Share Adapted Material for
164
+ NonCommercial purposes only.
165
+
166
+ 2. Exceptions and Limitations. For the avoidance of doubt, where
167
+ Exceptions and Limitations apply to Your use, this Public
168
+ License does not apply, and You do not need to comply with
169
+ its terms and conditions.
170
+
171
+ 3. Term. The term of this Public License is specified in Section
172
+ 6(a).
173
+
174
+ 4. Media and formats; technical modifications allowed. The
175
+ Licensor authorizes You to exercise the Licensed Rights in
176
+ all media and formats whether now known or hereafter created,
177
+ and to make technical modifications necessary to do so. The
178
+ Licensor waives and/or agrees not to assert any right or
179
+ authority to forbid You from making technical modifications
180
+ necessary to exercise the Licensed Rights, including
181
+ technical modifications necessary to circumvent Effective
182
+ Technological Measures. For purposes of this Public License,
183
+ simply making modifications authorized by this Section 2(a)
184
+ (4) never produces Adapted Material.
185
+
186
+ 5. Downstream recipients.
187
+
188
+ a. Offer from the Licensor -- Licensed Material. Every
189
+ recipient of the Licensed Material automatically
190
+ receives an offer from the Licensor to exercise the
191
+ Licensed Rights under the terms and conditions of this
192
+ Public License.
193
+
194
+ b. Additional offer from the Licensor -- Adapted Material.
195
+ Every recipient of Adapted Material from You
196
+ automatically receives an offer from the Licensor to
197
+ exercise the Licensed Rights in the Adapted Material
198
+ under the conditions of the Adapter's License You apply.
199
+
200
+ c. No downstream restrictions. You may not offer or impose
201
+ any additional or different terms or conditions on, or
202
+ apply any Effective Technological Measures to, the
203
+ Licensed Material if doing so restricts exercise of the
204
+ Licensed Rights by any recipient of the Licensed
205
+ Material.
206
+
207
+ 6. No endorsement. Nothing in this Public License constitutes or
208
+ may be construed as permission to assert or imply that You
209
+ are, or that Your use of the Licensed Material is, connected
210
+ with, or sponsored, endorsed, or granted official status by,
211
+ the Licensor or others designated to receive attribution as
212
+ provided in Section 3(a)(1)(A)(i).
213
+
214
+ b. Other rights.
215
+
216
+ 1. Moral rights, such as the right of integrity, are not
217
+ licensed under this Public License, nor are publicity,
218
+ privacy, and/or other similar personality rights; however, to
219
+ the extent possible, the Licensor waives and/or agrees not to
220
+ assert any such rights held by the Licensor to the limited
221
+ extent necessary to allow You to exercise the Licensed
222
+ Rights, but not otherwise.
223
+
224
+ 2. Patent and trademark rights are not licensed under this
225
+ Public License.
226
+
227
+ 3. To the extent possible, the Licensor waives any right to
228
+ collect royalties from You for the exercise of the Licensed
229
+ Rights, whether directly or through a collecting society
230
+ under any voluntary or waivable statutory or compulsory
231
+ licensing scheme. In all other cases the Licensor expressly
232
+ reserves any right to collect such royalties, including when
233
+ the Licensed Material is used other than for NonCommercial
234
+ purposes.
235
+
236
+
237
+ Section 3 -- License Conditions.
238
+
239
+ Your exercise of the Licensed Rights is expressly made subject to the
240
+ following conditions.
241
+
242
+ a. Attribution.
243
+
244
+ 1. If You Share the Licensed Material (including in modified
245
+ form), You must:
246
+
247
+ a. retain the following if it is supplied by the Licensor
248
+ with the Licensed Material:
249
+
250
+ i. identification of the creator(s) of the Licensed
251
+ Material and any others designated to receive
252
+ attribution, in any reasonable manner requested by
253
+ the Licensor (including by pseudonym if
254
+ designated);
255
+
256
+ ii. a copyright notice;
257
+
258
+ iii. a notice that refers to this Public License;
259
+
260
+ iv. a notice that refers to the disclaimer of
261
+ warranties;
262
+
263
+ v. a URI or hyperlink to the Licensed Material to the
264
+ extent reasonably practicable;
265
+
266
+ b. indicate if You modified the Licensed Material and
267
+ retain an indication of any previous modifications; and
268
+
269
+ c. indicate the Licensed Material is licensed under this
270
+ Public License, and include the text of, or the URI or
271
+ hyperlink to, this Public License.
272
+
273
+ 2. You may satisfy the conditions in Section 3(a)(1) in any
274
+ reasonable manner based on the medium, means, and context in
275
+ which You Share the Licensed Material. For example, it may be
276
+ reasonable to satisfy the conditions by providing a URI or
277
+ hyperlink to a resource that includes the required
278
+ information.
279
+ 3. If requested by the Licensor, You must remove any of the
280
+ information required by Section 3(a)(1)(A) to the extent
281
+ reasonably practicable.
282
+
283
+ b. ShareAlike.
284
+
285
+ In addition to the conditions in Section 3(a), if You Share
286
+ Adapted Material You produce, the following conditions also apply.
287
+
288
+ 1. The Adapter's License You apply must be a Creative Commons
289
+ license with the same License Elements, this version or
290
+ later, or a BY-NC-SA Compatible License.
291
+
292
+ 2. You must include the text of, or the URI or hyperlink to, the
293
+ Adapter's License You apply. You may satisfy this condition
294
+ in any reasonable manner based on the medium, means, and
295
+ context in which You Share Adapted Material.
296
+
297
+ 3. You may not offer or impose any additional or different terms
298
+ or conditions on, or apply any Effective Technological
299
+ Measures to, Adapted Material that restrict exercise of the
300
+ rights granted under the Adapter's License You apply.
301
+
302
+
303
+ Section 4 -- Sui Generis Database Rights.
304
+
305
+ Where the Licensed Rights include Sui Generis Database Rights that
306
+ apply to Your use of the Licensed Material:
307
+
308
+ a. for the avoidance of doubt, Section 2(a)(1) grants You the right
309
+ to extract, reuse, reproduce, and Share all or a substantial
310
+ portion of the contents of the database for NonCommercial purposes
311
+ only;
312
+
313
+ b. if You include all or a substantial portion of the database
314
+ contents in a database in which You have Sui Generis Database
315
+ Rights, then the database in which You have Sui Generis Database
316
+ Rights (but not its individual contents) is Adapted Material,
317
+ including for purposes of Section 3(b); and
318
+
319
+ c. You must comply with the conditions in Section 3(a) if You Share
320
+ all or a substantial portion of the contents of the database.
321
+
322
+ For the avoidance of doubt, this Section 4 supplements and does not
323
+ replace Your obligations under this Public License where the Licensed
324
+ Rights include other Copyright and Similar Rights.
325
+
326
+
327
+ Section 5 -- Disclaimer of Warranties and Limitation of Liability.
328
+
329
+ a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
330
+ EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
331
+ AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
332
+ ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
333
+ IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
334
+ WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
335
+ PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
336
+ ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
337
+ KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
338
+ ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
339
+
340
+ b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
341
+ TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
342
+ NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
343
+ INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
344
+ COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
345
+ USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
346
+ ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
347
+ DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
348
+ IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
349
+
350
+ c. The disclaimer of warranties and limitation of liability provided
351
+ above shall be interpreted in a manner that, to the extent
352
+ possible, most closely approximates an absolute disclaimer and
353
+ waiver of all liability.
354
+
355
+
356
+ Section 6 -- Term and Termination.
357
+
358
+ a. This Public License applies for the term of the Copyright and
359
+ Similar Rights licensed here. However, if You fail to comply with
360
+ this Public License, then Your rights under this Public License
361
+ terminate automatically.
362
+
363
+ b. Where Your right to use the Licensed Material has terminated under
364
+ Section 6(a), it reinstates:
365
+
366
+ 1. automatically as of the date the violation is cured, provided
367
+ it is cured within 30 days of Your discovery of the
368
+ violation; or
369
+
370
+ 2. upon express reinstatement by the Licensor.
371
+
372
+ For the avoidance of doubt, this Section 6(b) does not affect any
373
+ right the Licensor may have to seek remedies for Your violations
374
+ of this Public License.
375
+
376
+ c. For the avoidance of doubt, the Licensor may also offer the
377
+ Licensed Material under separate terms or conditions or stop
378
+ distributing the Licensed Material at any time; however, doing so
379
+ will not terminate this Public License.
380
+
381
+ d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
382
+ License.
383
+
384
+
385
+ Section 7 -- Other Terms and Conditions.
386
+
387
+ a. The Licensor shall not be bound by any additional or different
388
+ terms or conditions communicated by You unless expressly agreed.
389
+
390
+ b. Any arrangements, understandings, or agreements regarding the
391
+ Licensed Material not stated herein are separate from and
392
+ independent of the terms and conditions of this Public License.
393
+
394
+
395
+ Section 8 -- Interpretation.
396
+
397
+ a. For the avoidance of doubt, this Public License does not, and
398
+ shall not be interpreted to, reduce, limit, restrict, or impose
399
+ conditions on any use of the Licensed Material that could lawfully
400
+ be made without permission under this Public License.
401
+
402
+ b. To the extent possible, if any provision of this Public License is
403
+ deemed unenforceable, it shall be automatically reformed to the
404
+ minimum extent necessary to make it enforceable. If the provision
405
+ cannot be reformed, it shall be severed from this Public License
406
+ without affecting the enforceability of the remaining terms and
407
+ conditions.
408
+
409
+ c. No term or condition of this Public License will be waived and no
410
+ failure to comply consented to unless expressly agreed to by the
411
+ Licensor.
412
+
413
+ d. Nothing in this Public License constitutes or may be interpreted
414
+ as a limitation upon, or waiver of, any privileges and immunities
415
+ that apply to the Licensor or You, including from the legal
416
+ processes of any jurisdiction or authority.
417
+
418
+ =======================================================================
419
+
420
+ Creative Commons is not a party to its public
421
+ licenses. Notwithstanding, Creative Commons may elect to apply one of
422
+ its public licenses to material it publishes and in those instances
423
+ will be considered the “Licensor.” The text of the Creative Commons
424
+ public licenses is dedicated to the public domain under the CC0 Public
425
+ Domain Dedication. Except for the limited purpose of indicating that
426
+ material is shared under a Creative Commons public license or as
427
+ otherwise permitted by the Creative Commons policies published at
428
+ creativecommons.org/policies, Creative Commons does not authorize the
429
+ use of the trademark "Creative Commons" or any other trademark or logo
430
+ of Creative Commons without its prior written consent including,
431
+ without limitation, in connection with any unauthorized modifications
432
+ to any of its public licenses or any other arrangements,
433
+ understandings, or agreements concerning use of licensed material. For
434
+ the avoidance of doubt, this paragraph does not form part of the
435
+ public licenses.
436
+
437
+ Creative Commons may be contacted at creativecommons.org.
438
+
models/final_model/final_model.tflite ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7f9ad41bedf5609e1f92e1470b3119fbc313892f947a3dfae7958f04d34f778b
3
+ size 30643344
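
Note (not part of the commit): the three lines above are a Git LFS pointer, not the model itself — the real final_model.tflite (about 30 MB) is fetched by `git lfs pull` and must match the recorded sha256 oid and byte size. Below is a minimal sketch, assuming the repository layout shown in this diff, for verifying after cloning that the file was actually materialized rather than left as a pointer stub; the path and expected values are copied from the pointer, everything else is illustrative.

# verify_lfs_model.py -- sanity-check the LFS-managed TFLite model (illustrative only)
import hashlib
from pathlib import Path

# Values copied from the LFS pointer shown in the diff above
EXPECTED_OID = "7f9ad41bedf5609e1f92e1470b3119fbc313892f947a3dfae7958f04d34f778b"
EXPECTED_SIZE = 30643344  # bytes

model_path = Path("models/final_model/final_model.tflite")
data = model_path.read_bytes()

# An un-pulled LFS file still contains the plain-text pointer, which always
# starts with the spec line below.
if data.startswith(b"version https://git-lfs.github.com/spec/v1"):
    raise RuntimeError("File is still an LFS pointer; run `git lfs pull` first.")

actual_size = len(data)
actual_oid = hashlib.sha256(data).hexdigest()

assert actual_size == EXPECTED_SIZE, f"size mismatch: {actual_size} bytes"
assert actual_oid == EXPECTED_OID, f"sha256 mismatch: {actual_oid}"
print("final_model.tflite matches the LFS pointer (sha256 and size).")

Run it from the repository root after cloning; if it prints the success line, the deployed image will be baking in the real model weights rather than a 130-byte pointer file.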