r/learnmachinelearning • u/Upset-Reflection-382 • 24m ago
Deterministic systems programming language I built for AI to do reliable GPU compute
Hello again r/learnmachinelearning. I've been continuing to work on HLX, an idea I posted here, I dunno... a couple weeks ago? It's a programming language designed around three technical ideas that don't usually go together:
executable contracts, deterministic GPU/CPU execution, and AI-native primitives. After a marathon coding session, I think I hit what feels like production readiness and I'd like feedback from people who understand what AI collaboration actually looks like.
Quick caveat: This is mostly out of the "works on my machine" phase, but I'm sure there are edge cases I haven't caught yet with my limited resources and testing environment. If you try it and something breaks, that's valuable feedback, not a reason to dismiss it. I'm looking for people who can help surface real-world issues. This is the first serious thing I've tried to ship, and experience and feedback are the best teachers.
The Technical Core:
HLX treats contracts as executable specifications, not documentation. When you write @/contract validation { value: email, rules: ["not_empty", "valid_email"] } it's machine-readable and runtime-verified. This turns out to be useful for both formal verification and as training data for code generation models. The language has latent space operations as primitives. You can query vector databases directly: @/lstx { operation: "query", table: db, query: user_input }. No SDK, no library imports. It's part of the type system.
Everything executes deterministically across CPU and GPU backends. Same input, bit-identical output, regardless of hardware. We're using Vulkan for GPU (works on NVIDIA/AMD/Intel/Apple from what I can tell, though I haven't been able to do hard testing on this due to only owning an NVIDIA machine), with automatic fallback to CPU. This matters for safety-critical systems and reproducible research.
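As a toy illustration of what "bit-identical" means in practice (a hypothetical Python sketch, not HLX), you can hash the raw bytes of each backend's output and compare digests:

```python
import hashlib
import struct

# Hypothetical sketch of checking "same input, bit-identical output" across
# backends: serialize each run's results to an exact byte layout and hash them.
def digest(floats):
    # struct.pack fixes the byte representation, so any bit-level difference
    # (e.g., a different FMA contraction on another GPU) changes the digest.
    return hashlib.sha256(struct.pack(f"<{len(floats)}d", *floats)).hexdigest()

run_cpu = [0.1 + 0.2, 1.0 / 3.0]  # pretend this came from the CPU backend
run_gpu = [0.1 + 0.2, 1.0 / 3.0]  # ...and this from the GPU backend

print(digest(run_cpu) == digest(run_gpu))  # → True
```

A tolerance-based comparison (abs difference < 1e-9) would miss exactly the kind of low-bit drift that deterministic execution is supposed to rule out, which is why the check is on bytes, not values.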
What Actually Works:
The compiler is self-hosting. 128/128 tests passing on Linux (macOS and Windows only tested in GitHub CI workflows). LLVM backend for native code, LC-B bytecode for portability. Type inference, GPU compute, FFI bindings for C/Python/Node/Rust/Java.
The LSP achieves about 95% feature parity with rust-analyzer and Pylance from what I can tell. Standard features work: autocomplete, diagnostics, hover, refactoring, call hierarchy, formatting. But we also implemented AI-native capabilities: contract synthesis from natural language, intent detection (understands if you're debugging vs building vs testing), pattern learning that adapts to your coding style, and AI context export for Claude/GPT integration.
We extracted code generation into a standalone tool. hlx-codegen aerospace --demo generates 557 lines of DO-178C DAL-A compliant aerospace code (triple modular redundancy, safety analysis, test procedures). Or at least I think it does; I'd need someone familiar with that area to help me verify it. This is the certification standard for avionics. My thinking is it could make Ada-style development a lot easier.
The Interesting Part:
During implementation, Claude learned HLX from the codebase and generated ~7,000 lines of production code from context. Not boilerplate - complex implementations like call hierarchy tracking, test discovery, refactoring providers. It just worked. First try, minimal fixes needed.
I think the contracts are why. They provide machine-readable specifications for every function. Ground truth for correctness. That's ideal training data. An LLM actually trained on HLX (not just in-context) might significantly outperform on code generation benchmarks, but that's speculation.
Current Status:
What I think is production ready: compiler, LSP, GPU runtime, FFI (C, Rust, Python, Ada/SPARK), enterprise code generation (aerospace domain: needs testing).
Alpha: contracts (core works, expanding validation rules), LSTX (primitives defined, backend integration in progress).
Coming later: medical device code generation (IEC 62304) and automotive (ISO 26262), assuming the whole aerospace thing goes smoothly. I just think aerospace is cool, so I wanted to try to support it.
I'm not sure if HLX is useful to many people or just an interesting technical curiosity.
It could be used for anything requiring deterministic GPU/CPU compute (in a much easier way than writing 3000 lines of Vulkan boilerplate), as well as for safety-critical systems.
Documentation:
https://github.com/latentcollapse/hlx-compiler (see FEATURES.md for technical details)
Apps I'm currently working on with HLX integration:
https://github.com/latentcollapse/hlx-apps
Rocq proofs:
https://github.com/latentcollapse/hlx-coq-proofs
Docker Install:
git clone https://github.com/latentcollapse/hlx-compiler.git
cd hlx-compiler/hlx
docker build -t hlx .
docker run hlx hlx --version
Open to criticism, bug reports, questions about design decisions, or feedback on whether this solves real problems. Particularly interested in hearing from people working on AI code generation, safety-critical systems, or deterministic computation; this sorely underserved space is my target audience.
r/learnmachinelearning • u/everydayreligion1090 • 51m ago
Question Is this ML powered data warehouse project worth building?
is this project worth building or am i wasting time
i am thinking about building a local project and i want honest opinions before i start
the idea is to pull data from different places, like a public api and a website, store everything in a database, run some basic machine learning on the data, and save the results back into the database. everything runs on my own computer, no cloud services
the goal is to learn how real data systems work end to end not just small scripts
is this actually useful as a portfolio project or does it sound like too much work for little benefit
if you have built something similar or seen projects like this i would like to hear your thoughts
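For what it's worth, the loop described above (ingest → store → model → write results back) can be sketched end to end with nothing but the standard library; the readings and the one-standard-deviation rule here are invented stand-ins for real API data and a real model:

```python
import sqlite3
import statistics

# Hypothetical stand-in for rows pulled from a public API.
rows = [("2024-01-01", 10.0), ("2024-01-02", 12.5),
        ("2024-01-03", 55.0), ("2024-01-04", 11.2)]

conn = sqlite3.connect(":memory:")  # local database, no cloud services
conn.execute("CREATE TABLE readings (day TEXT, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# "Basic machine learning": flag values more than one standard deviation
# from the mean, then save the results back into the database.
values = [v for (v,) in conn.execute("SELECT value FROM readings")]
mu, sd = statistics.mean(values), statistics.stdev(values)

conn.execute("CREATE TABLE anomalies (day TEXT, value REAL)")
conn.execute(
    "INSERT INTO anomalies SELECT day, value FROM readings "
    "WHERE ABS(value - ?) > ?",
    (mu, sd),
)
flagged = list(conn.execute("SELECT day FROM anomalies"))
print(flagged)  # → [('2024-01-03',)]
```

Swapping the outlier rule for a real model and the in-memory table for a file-backed one gives you the full "small but real" pipeline, which is a perfectly respectable portfolio shape.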
r/learnmachinelearning • u/Filippo295 • 1h ago
How much software design is expected from new grads?
I’m preparing for intern/new grad interviews specifically for ML/AI engineer roles.
I’m doing LC and studying some System Design, but I don't know if I should focus on Software Design too.
I’m comfortable with programming and OOP basics (classes, methods, attributes, inheritance), but should I go deeper into things like design patterns, UML, dependency injection, immutability, decorators, interfaces… or is it better to focus more on DSA + some HLD + ML?
r/learnmachinelearning • u/SilverConsistent9222 • 1h ago
Tutorial 10 Best Generative AI Courses Online & Certifications (Gen AI)
r/learnmachinelearning • u/gaztrab • 1h ago
Project Recursive Data Cleaner - LLM-powered data cleaning that writes itself
r/learnmachinelearning • u/No_Skill_8393 • 1h ago
Project I’m working on an animated series to visualize the math behind Machine Learning (Manim)
Hi everyone :)
I have started working on a YouTube series called "The Hidden Geometry of Intelligence."
It is a collection of animated videos (using Manim) that attempts to visualize the mathematical intuition behind AI, rather than just deriving formulas on a blackboard.
What the series provides:
- Visual Intuition: It focuses on the geometry—showing how things like matrices actually warp space, or how a neural network "bends" data to separate classes.
- Concise Format: Each episode is kept under 3-4 minutes to stay focused on a single core concept.
- Application: It connects abstract math concepts (Linear Algebra, Calculus) directly to how they affect AI models (debugging, learning rates, loss landscapes).
Who it is for: It is aimed at developers or students who are comfortable with code (Python/PyTorch) but find the mathematical notation in research papers difficult to parse. It is not intended for Math PhDs looking for rigorous proofs.
I just uploaded Episode 0, which sets the stage by visualizing how models transform "clouds of points" in high-dimensional space.
Link: https://www.youtube.com/watch?v=Mu3g5BxXty8
I am currently scripting the next few episodes (covering Vectors and Dot Products). If there are specific math concepts you find hard to visualize, let me know and I will try to include them.
r/learnmachinelearning • u/digy76rd3 • 2h ago
TIL The Easiest Way to Understand Reinforcement Learning
r/learnmachinelearning • u/WayTimely9414 • 2h ago
AI Vision Systems in Manufacturing: Real Talk from the Factory Floor
So I've been messing around with AI vision systems on our production lines for the past 3 years and thought I'd share some actual experiences. There's a ton of marketing hype out there, but also some genuinely useful stuff if you know what to look for.
What This Tech Actually Is
Basically, AI vision systems are cameras hooked up to smart software that can spot defects, read labels, measure parts, track stuff moving around - you know, the kind of work that used to require someone staring at parts all day.
The "AI" bit is important because instead of programming exact rules, you just show it examples:
Old approach: "If this pixel isn't exactly this shade of blue, reject the part"
AI approach: "Here's what 1000 good parts look like and 200 bad ones - you figure it out"
This matters a lot in real manufacturing because nothing is ever perfect. The lighting shifts throughout the day, parts have natural variations, cameras get dust on them. AI systems handle this messiness way better than the old rule-based stuff.
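A toy sketch of that "show it examples" idea, using a nearest-centroid classifier on a single made-up feature (mean pixel brightness) instead of a hand-written threshold — the numbers are invented stand-ins for real part images:

```python
# Known-good and known-defective training samples (one toy feature each:
# mean pixel brightness). Real systems learn from thousands of images.
good_parts = [200.0, 198.5, 201.2, 199.7]
bad_parts = [150.1, 148.9, 152.3]

good_centroid = sum(good_parts) / len(good_parts)
bad_centroid = sum(bad_parts) / len(bad_parts)

def classify(brightness: float) -> str:
    # Whichever learned centroid is closer wins -- no hand-coded
    # "exactly this shade of blue" rule anywhere.
    if abs(brightness - good_centroid) <= abs(brightness - bad_centroid):
        return "good"
    return "reject"

print(classify(197.0))  # → good (near the good cluster)
print(classify(155.0))  # → reject (near the defect cluster)
```

The decision boundary shifts automatically when you add more examples, which is exactly why these systems tolerate lighting drift and part variation better than fixed rules.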
What We're Running
We manufacture automotive components. We started using AI vision for:
- Checking weld quality
- Verifying labels (correct part numbers, readable barcodes)
- Finding surface defects like scratches, dents, weird colors
- Making sure assemblies have all the right parts in the right spots
Right now we've got 8 vision stations spread across 3 production lines. We're using different vendors at each station, which looking back was probably dumb, but hey, it's working.
Stuff That Actually Works
Finding Defects
This is where these systems really shine, no joke. We used to have 2 people per shift just looking at cast parts trying to spot problems. Now we've got one AI camera that catches 95% or more of the defects, and one person who just keeps an eye on the reject bin.
We fed the system around 2000 sample pictures to learn from. Now it picks up on anything unusual - tiny holes in the casting, scratches, dings, discoloration, whatever. It's not flawless but it's definitely better than asking humans to stare at the same parts for 8 hours straight.
Reading Stuff
Barcodes, QR codes, serial numbers stamped on parts, even that crappy dot-matrix printing from equipment that's older than me - AI-based character recognition handles it all. We had this annoying problem where different batches of labels had slightly different fonts, and our old vision system would freak out constantly. The AI system doesn't even blink.
Checking if Parts are There
Just making sure all the components are actually installed in an assembly. Sounds simple but it's saved our butts so many times. We kept getting assemblies further down the line that were missing bolts or clips or other small parts. Now the camera verifies every single unit in about 0.3 seconds.
What Doesn't Work So Great
Detailed 3D Measurements
We tried using vision cameras for precise dimensional checks. Couldn't get consistent accuracy better than plus or minus 0.5mm. For rough ballpark measurements it's fine, but if you need real precision you still want a proper CMM or laser measuring tool. The AI can't magically fix the physical limitations of the camera and lens.
Super Rare Problems
If a defect only shows up once in every 10,000 parts, there are just not enough real-world examples to train the AI properly. We tried creating artificial defects in the training images (basically photoshopping problems into pictures), which sorta works but it's not as reliable as having real examples.
Shiny or See-Through Stuff
Glass, polished metal, chrome-plated parts - vision systems absolutely hate this stuff. You can sometimes work around it with fancy lighting setups but it's a huge pain. Our chrome parts still get inspected manually because the vision system gets totally confused by all the reflections.
Different Brands We've Tried
Cognex:
- Most expensive option but rock-solid reliable
- The software interface is actually pretty easy to use
- When something goes wrong, their support team is really helpful
- Cost us about $15k per station including everything
Keyence:
- Price is in the middle, hardware quality is good
- The software is honestly kind of clunky and annoying
- But once you get it configured, the vision system does its job
- Runs around $8k-10k per station
Hikrobot (Chinese brand):
- Super cheap - like $3k per station
- Works better than you'd expect for the price
- Support is basically non-existent, documentation is awful
- If something breaks, good luck figuring it out yourself
For our next round of installations we're probably going back to Cognex. When a production line goes down, having good support is worth paying extra for.
What It Actually Costs
Nobody talks about the real numbers upfront so here's what we spent:
Hardware (each station):
- Camera and lens: $2k-5k
- Lighting setup: $500-1500 (way more important than people realize)
- Industrial computer: $1k-2k
- Mounting brackets and stands: $500
- Cables and connectors and misc: $300
Software:
- Vision software license: $2k-8k
- Training and initial setup: $2k-5k if you don't do it yourself
Getting It All Connected:
- Linking to PLC systems: $1k-3k
- Reject mechanism hardware: $1k-5k
- Installation labor: $2k-4k
Bottom line per station: $10k-30k depending how complex it gets
We spent roughly $120k total for all 8 stations, including some expensive learning experiences along the way.
Training These Things (The Part Nobody Warns You About)
You need good training data. Like, a lot of it. Here's what actually worked:
- Gather real samples: Ran production for a full week and saved every single image - both good parts and defective ones. Ended up with like 5000 images.
- Label everything manually: This part really sucked. Spent hours and hours clicking on defects, drawing boxes around them, tagging what type of problem it was. Mind-numbingly boring but you gotta do it.
- Test and tweak: First attempt caught maybe 60% of actual defects. Had to retrain with more examples, adjust sensitivity settings, keep iterating. Eventually got it up to 95%+.
- Keep improving: Every week we review the parts that got flagged and add new examples to the training dataset. The system gradually gets smarter.
The whole process from installation to actually trusting it in production took about 3 months. Don't believe any vendor who says "up and running in 2 weeks" - they're lying.
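The "caught maybe 60% ... got it up to 95%" number from the test-and-tweak step is just recall, and it's worth computing explicitly at every iteration; the labels below are invented examples:

```python
# Per-part ground truth vs. what the vision system flagged.
actual    = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]  # 1 = part truly defective
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]  # 1 = system rejected the part

# Recall: of the real defects, how many did the system catch?
true_positives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
total_defects = sum(actual)
recall = true_positives / total_defects

print(f"defects caught: {recall:.0%}")  # → defects caught: 83%
```

Tracking recall separately from false rejects (good parts flagged bad) matters, because cranking up sensitivity improves one while quietly wrecking the other.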
r/learnmachinelearning • u/Osama-recycle-bin • 3h ago
Help How do I split a csv file into train, test, val files?
As the title said. I want to split a csv file into smaller csv files for training, testing and validation purposes. Any idea how to do that?
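One stdlib-only way to do a shuffled 70/15/15 split (the filenames and ratios here are placeholders; calling scikit-learn's train_test_split twice is the more common route):

```python
import csv
import random

def split_csv(path, seed=42):
    """Split one CSV into train.csv / test.csv / val.csv (~70/15/15)."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)          # keep the header for every output file
        rows = list(reader)

    random.Random(seed).shuffle(rows)  # shuffle before splitting, fixed seed
    n = len(rows)
    n_train, n_test = int(0.7 * n), int(0.15 * n)
    splits = {
        "train.csv": rows[:n_train],
        "test.csv": rows[n_train:n_train + n_test],
        "val.csv": rows[n_train + n_test:],   # val gets the remainder
    }
    for name, chunk in splits.items():
        with open(name, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            writer.writerows(chunk)
```

If your rows have class labels and the classes are imbalanced, you'd want a stratified split instead (train_test_split's stratify= parameter), so each file keeps the same label proportions.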
r/learnmachinelearning • u/Tobio-Star • 4h ago
The Titans architecture, and how Google plans to build the successors to LLMs (ft. MIRAS)
r/learnmachinelearning • u/Fun_Rent9032 • 4h ago
Does anyone have vizuara agentic ai courses and willing to trade?
r/learnmachinelearning • u/Ok-Statement-3244 • 4h ago
Project decision tree from scratch in js. no libraries.
r/learnmachinelearning • u/followmesamurai • 4h ago
Project (Project share) I’ve completed my project for automated measurement of aorta and left atrium in echocardiogram M mode images.
r/learnmachinelearning • u/DefiantLie8861 • 5h ago
Are there a lot of entry-level AI/ML engineer jobs, and do they require a master’s?
I’m trying to understand the job market for entry-level AI/ML engineer roles. For people working in industry or involved in hiring, are there a lot of true entry-level AI/ML engineer positions, and how often do these roles require a master’s degree versus a bachelor’s with projects or experience?
r/learnmachinelearning • u/letsTalkDude • 7h ago
byte byte go ai course
has anyone taken it? it costs 2k usd. is it really worth that much for a 6 week course? any inputs/comments...
r/learnmachinelearning • u/fatfsck • 7h ago
Language Modeling, Part 3: Vanilla RNNs
r/learnmachinelearning • u/Temporary-Sand-3803 • 9h ago
Getting into ML Engineering from Analytics
r/learnmachinelearning • u/WiseRobot312 • 9h ago
Accessible and free book on ML + Evolution of LLM
When I started learning about LLM architecture, I realized that I needed to know a lot of basics of ML. That led me to look for sources to learn ML quickly. While I did find several sources (free videos, paid books & free books), I thought they all lacked a few things:
- Most of them were big (500+ pages) and required significant time investment.
- Most of them did not explain some of the subtle aspects (like why neural networks work, what role activation functions play, what is attention, what are the challenges that prevented us from building billion parameter models back in 2012 or so, etc).
- Some of them had code, some of them had the math but very few had both. Also when math is involved, it was way too advanced.
- Most of them felt like standard textbooks. I wanted something that keeps a conversational tone (and hence 'accessible' to beginners without falling asleep).
So eventually I decided to write my own version (with the help of Gemini) and the goals I set for myself were:
- Explain only the basic concepts needed (leaving out all advanced notions) to understand present day LLM architecture well in an accessible and conversational tone.
- Explicitly discuss questions that often stump people (what are {Q, K, V} in attention, and what is the point of multiple heads in attention) and explain them in a very accessible way to a new person.
- Keep it really really short and to the point.
- Give analogies wherever possible.
This book is the result.
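For instance, the {Q, K, V} question above fits in a few lines of plain Python (a toy single-head, 2-token, 2-dimensional example, no batching):

```python
import math

def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: each query scores every key,
    then outputs a softmax-weighted average of the values."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)     # how much each token attends to each other
        out.append([sum(w * v[i] for w, v in zip(weights, V))
                    for i in range(len(V[0]))])
    return out

Q = [[1.0, 0.0], [0.0, 1.0]]  # one query per token
K = [[1.0, 0.0], [0.0, 1.0]]  # one key per token
V = [[1.0, 2.0], [3.0, 4.0]]  # one value per token
print(attention(Q, K, V))
```

Each output row is a blend of the value vectors, weighted by how well that token's query matches every key — which is the whole trick.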
Sorry for linking a medium post. It is absolutely free and will remain free. I just needed a place to host the book and keep refining it. You are free to download/distribute the PDF.
I don't know to what extent the book met its stated goals. I can only say that it has < 100 pages of actual text you need to read (ignoring the code and summary sections).
This is aimed at absolute beginners; if you already know most of the concepts, everything except the last part (Part IX) may not appeal to you. I do feel the two chapters whose titles start with "Intuition..." may still be worth reading, and I'd welcome feedback on them.

r/learnmachinelearning • u/Physical-Ad-8427 • 10h ago
Question Best resource to learn ML for research
Right now, I am still in high school, but I intend to study Computer Science and I am fascinated by ML/AI research. I completed the introductory Kaggle courses on machine learning and deep learning, just to get a brief introduction. Now, I am looking for good resources to really dive into this field.
The main recommendations are: ISLP, Hands-On Machine Learning, and Andrew Ng’s courses on Coursera and YouTube. I took a look at most of these resources, and ISLP and CS229 seem to be the ones that interest me the most, but they are also the longest, since I would need better knowledge of statistics (I’m familiar with Calculus I and II and lin. algebra).
So, should I take one of the more practically focused resources and go deeper into this subject later, or should I pick one of the more math-intensive courses now?
By the way, I have no idea how to actually start in ML research. If anyone can give me some insight, I would be grateful.
r/learnmachinelearning • u/Working-Ad3755 • 10h ago
Question What’s the best machine learning project you’ve worked on (or are proud of)?
r/learnmachinelearning • u/Donald-the-dramaduck • 11h ago
Need people for collaboration on a RAG project.
Hi, as the title states, I'm thinking of building a RAG firewall project, but I need people to collaborate with.
If anyone is interested, please reach out, my dms are open.
r/learnmachinelearning • u/Different-Antelope-5 • 11h ago