EdgeCaseForge AI is a Python/Streamlit app for building coding-evaluation tasks for AI models. It helps generate problem statements, edge-case traps, hidden-test ideas, golden-solution hints, and sample validation for Python solutions.
I built this as a portfolio project to demonstrate practical AI-evaluation, software-engineering, and test-design skills.
AI coding models can often solve simple examples, but they still fail on edge cases, ambiguous requirements, hidden constraints, and tricky validation logic.
EdgeCaseForge AI focuses on the harder part of AI evaluation:
- designing coding tasks that test real reasoning
- identifying where models may fail
- creating hidden tests and edge cases
- validating sample solutions
- explaining the golden-solution strategy
- Challenge library with multiple coding task categories
- AI failure analysis for each problem
- Hidden test-case ideas for stronger evaluation
- Golden-solution hints for benchmark design
- Python solution runner for sample tests
- Portfolio pitch generator for explaining the project
- Python
- Streamlit
- JSON
- Subprocess-based local test runner
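A subprocess-based runner of this kind could look roughly like the sketch below. It is not the code in `app.py`: the `run_solution` name, the inline harness, and the 2-second timeout are all assumptions made for illustration. The sketch writes a submitted solution to a temporary file, runs it in a separate Python process, and passes the test input on stdin.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical harness: loads the submitted file as a module and
# prints the result of solve(<stdin contents>).
HARNESS = textwrap.dedent("""
    import sys, importlib.util
    spec = importlib.util.spec_from_file_location("submission", sys.argv[1])
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    sys.stdout.write(module.solve(sys.stdin.read()))
""")


def run_solution(source: str, input_data: str, timeout: float = 2.0) -> tuple[bool, str]:
    """Run `source` in a child process; return (ok, stdout or error text)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "submission.py"
        path.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-c", HARNESS, str(path)],
                input=input_data,
                capture_output=True,
                text=True,
                timeout=timeout,  # assumed limit; stops runaway loops
            )
        except subprocess.TimeoutExpired:
            return False, "timeout"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout
```

A separate process keeps a crash or infinite loop in the submission from taking down the Streamlit app, but it is not a security sandbox: the child process still has the same file and network access as the app.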
EdgeCaseForge-AI/
├── app.py
├── requirements.txt
├── README.md
├── DEMO_SCRIPT.md
├── LICENSE
├── .gitignore
├── data/
│   └── challenges.json
├── examples/
│   └── ledger_solution.py
└── assets/
    └── preview.png
git clone https://github.com/BeauDevCode/EdgeCaseForge-AI.git
cd EdgeCaseForge-AI
pip install -r requirements.txt
streamlit run app.py
- Choose a coding challenge.
- Read the problem statement and constraints.
- Review the likely AI model failure points.
- Study the hidden-test ideas.
- Paste a Python solution using this format:
def solve(input_data: str) -> str:
    return "your answer"
- Run the sample tests.
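To make the `solve` format concrete, here is a minimal sketch with a toy task and a sample check. The task (summing whitespace-separated integers), the `check_samples` helper, and the whitespace-trimmed comparison are all assumptions for illustration; the app's own comparison rules may differ.

```python
def solve(input_data: str) -> str:
    # Toy task (assumed): sum all whitespace-separated integers in the input.
    numbers = [int(token) for token in input_data.split()]
    return str(sum(numbers))


def check_samples(solve_fn, samples):
    """Return a list of (input, expected, actual) for every failing sample.

    Hypothetical helper: compares outputs after trimming surrounding whitespace.
    """
    failures = []
    for given, expected in samples:
        actual = solve_fn(given)
        if actual.strip() != expected.strip():
            failures.append((given, expected, actual))
    return failures


samples = [
    ("1 2 3\n", "6"),
    ("", "0"),        # edge case: empty input
    ("-5 5", "0\n"),  # edge case: negatives and a trailing newline in the expected output
]
failures = check_samples(solve, samples)
```

An empty `failures` list means every sample passed. Otherwise each entry shows the input, the expected output, and what the solution actually returned.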
Room Collision Validator
A 2D top-down game level contains rectangular walls. A player is represented as a circle. The task is to detect the first movement step where the player collides with a wall.
Why this is difficult for AI models:
- Many models only check whether the circle center is inside the rectangle.
- Correct collision requires circle-rectangle overlap logic.
- Reversed wall coordinates must be normalized.
- Touching edges must be handled explicitly: the task has to say whether a circle that exactly touches a wall counts as a collision.
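The points above can be sketched with the standard closest-point test for circle-rectangle overlap. This is an illustrative sketch, not the project's golden solution: the function names, the 0-based step index, the `-1` return for "no collision", and the `touching_counts` flag are assumptions, since the challenge's exact input format and tie-breaking rule are not given here.

```python
def circle_hits_rect(cx, cy, r, x1, y1, x2, y2, touching_counts=True):
    """Return True if a circle (center cx, cy, radius r) overlaps a wall.

    The wall is given by two opposite corners in any order.
    """
    # Normalize reversed coordinates so left <= right and bottom <= top.
    left, right = min(x1, x2), max(x1, x2)
    bottom, top = min(y1, y2), max(y1, y2)

    # Closest point on the rectangle to the circle center (clamping).
    nearest_x = min(max(cx, left), right)
    nearest_y = min(max(cy, bottom), top)

    # Compare squared distances to avoid a square root.
    dx, dy = cx - nearest_x, cy - nearest_y
    dist_sq = dx * dx + dy * dy
    if touching_counts:
        return dist_sq <= r * r
    return dist_sq < r * r


def first_collision_step(positions, r, walls, touching_counts=True):
    """Return the 0-based index of the first colliding position, or -1."""
    for step, (cx, cy) in enumerate(positions):
        for wall in walls:
            if circle_hits_rect(cx, cy, r, *wall, touching_counts=touching_counts):
                return step
    return -1
```

Near a corner, this test is stricter than expanding the rectangle by `r` on every side: a circle at `(5, 5)` with radius 1 does not hit a wall spanning `(0, 0)` to `(4, 4)`, even though it lies inside the expanded bounding box. With real-valued coordinates, an exact-touch comparison is sensitive to floating-point rounding, so a task would likely need integer inputs or a stated tolerance.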
EdgeCaseForge AI is a Python/Streamlit app that helps design hard coding tasks for AI model evaluation. It creates problem statements, hidden test ideas, golden-solution hints, and model-failure explanations. I built it to show how AI coding models can be tested beyond simple examples, using edge cases and validation logic.
Building this project helped me practice:
- designing better coding problems
- thinking like an AI evaluator
- writing clear test cases
- creating edge-case-driven validation
- building a clean Streamlit app
- structuring a GitHub portfolio project
- Add Docker sandboxing for safer code execution
- Add JavaScript and C++ solution runners
- Add downloadable challenge packages
- Add difficulty scoring based on edge-case coverage
- Add optional LLM-assisted challenge generation
This project is licensed under the MIT License.
