There are a bunch of questions about Julia, so I'll do my best to give a short answer to a very long and complicated topic. Up front, Julia is a wonderful language and a wonderful community, I am a super fan.
That said, Mojo is a completely different thing. It is aligned with the Python community to solve specific problems outlined here:
https://docs.modular.com/mojo/why-mojo.html
Mojo also has a bunch of technical advancements compared to Julia by virtue of it being a much newer development and being able to learn from it (and Swift and Rust, and C++ and many many other languages). Including things like ownership and no GC. We also think there is room for a new language that is easier to deploy, scales down to small envelopes, works directly with the full Python ecosystem, is designed for ML and for MLIR from first principles, etc.
Julia is far more mature and advanced in many ways. Many folks have and will continue to push Julia forward and we wish them the best, it is a lovely ecosystem and language. There is room for more than one thing! :)
EDIT: Just in case there is any confusion, I work for Modular, built LLVM, Swift, Clang, MLIR and a variety of other things. I wasn't trying to misrepresent as being unaffiliated.
Congratulations on the launch! I think it's always a great thing to have people who know what they're doing put a new design out there. Raises the bar for everyone. For example, I think the Rust folks' work on error messages has really raised the bar on what is expected of systems in that regard. Sometimes people working on older systems feel a bit uncomfortable when they see the bar being raised on them (it's weird to me to think of Julia as an "older" system, but I guess being around for more than a decade counts), but I actually prefer to think about it in the opposite way. There are lots of forces in systems design that push production systems towards conservatism. "We now have a million users, is it really worth spending the time/effort/risk on this new experimental parser/optimizer/feature, etc?", but if the rest of the world raises the bar, it's a great incentive to keep up and make all our systems better. So I very sincerely wish you the best of luck here, and that in the areas where there might end up being overlap between Julia and Mojo, people will start complaining to us that we need to be better, because we might just take them up on it ;).
> Julia is far more mature and advanced in many ways. Many folks have and will continue to push Julia forward and we wish them the best, it is a lovely ecosystem and language. There is room for more than one thing! :)
In general this tends to be true. However, in this case I'm not so sure. Modular seems to have garnered a lot of investment - probably orders of magnitude more than the Julia community has been able to get. There are a lot of nagging problems in Julia (startup times, though those have gotten better recently; ML kernel performance; and standalone executables come to mind) that could have been easily fixed if they had the money to throw at them. Since they haven't had that kind of investment, people who kick Julia's tires tend to see these things as built-in limitations and move on.
This is too real, and any of the great investments of manpower (e.g. TensorFlow for Swift), had it happened to Julia, would probably have returned 10x or 100x the ROI -- just look at how few devs and lines of code Julia's alternatives to pandas/numpy/ML/autodiff/plotting have. If the Julia ecosystem can be somewhat competitive while only having part-time contributors and researchers' side projects, it WILL thrive if properly invested in.
Just a thought, but perhaps it's the small community and independent culture of Julia that has led to the high quality of its software. Small groups of highly passionate people can accomplish a lot! If I recall the history correctly, scientific Python (numpy, scipy, etc) developed similarly at first and has mostly supplanted Matlab, Fortran, and other tools. There was a point in time when Python was considered niche and not for "serious" work :).
Hard disagree. I think some people (not necessarily you, but way too many people) weirdly envy that "lone/few geniuses" image too much, and when they see bigger communities and their problems, they fallaciously/unfairly assume it's because of the size of the community (and not, say, unnecessary bureaucracy that a small part of that community decided to adopt early on, long before it got big).
One of Julia's often-complained-about issues is that it could use way more developers than it has now. No amount of romanticizing a small community or "independent culture" (which I just can't see going away, given where a lot of the people who come to Julia are coming from in the first place) is going to fix that; only more people coming aboard the ship will.
Julia (maybe by virtue of being a Lisp?) is tuned for the "lone genius developer" use case. Its feature set for enabling working in teams, where you'd want more well defined interfaces, explicit structure, control, checks, enforcement mechanisms for conventions, etc... seems weak by comparison with most other modern languages.
So it's not so obvious that a larger investment would scale so well immediately. (But it might force Julia to get better at these things...)
Julia 1.0 was released in 2018. That is just 5 years ago. I would say that is very young, especially since a language today needs more than in the past. Julia has a package manager, virtual environments, and version management - stuff that tends to be bolted on much later.
You needed much less stuff supported out of the box when Python first came on the scene. Today expectations have gotten much bigger. A minimal viable language has far more requirements.
people used to talk about R vs Python for data science many years ago, for instance, and we all know how that ended. imo if Mojo lives up to these claims, the pitch of Python compatibility is almost certainly too compelling to ignore. not to say Julia will die; R still has its uses and dominates some niche areas
R as a language was nothing to write home about. All it had going for it was its libraries. I never rooted for R, and I don't care for python anymore either. Python was general purpose, had enough libraries, and a bigger community. Its victory was predictable.
Congratulations on the launch and work so far! This is indeed very interesting. I believe a language like this is necessary for future progress. I have also been impressed with Julia and its progress. I would have liked to see more progress on Ahead of Time compilation and using it to write Python extension modules.
For Mojo, I'm interested in seeing how the language can be used as a path forward for the Cython community. This could be a stepping-stone towards a reimplementation of Python in Mojo. For the past 3 years, I have been talking about the need for a Python-Steering-Council-recommended extension language for Python. This will be particularly important as WebAssembly keeps progressing and potentially redefining what we mean by virtualization and containers.
We have already been using LLVM extensively in Numba and there have been several explorations around MLIR and related technologies. There are several potential paths forward and I'm looking forward to finding ways to cooperate.
Understanding what will be open-source is of course, critical for that.
>Python has amazing strengths as a glue layer, and low-level bindings to C and C++ allow building libraries in C, C++ and many other languages with better performance characteristics. This is what has enabled things like numpy, TensorFlow and PyTorch and a vast number of other libraries in the ecosystem. Unfortunately, while this approach is an effective way to building high performance Python libraries, its approach comes with a cost: building these hybrid libraries is very complicated, requiring low-level understanding of the internals of cpython, requires knowledge of C/C++/… programming...
But the cost has already been paid. We have NumPy, we have PyTorch and TensorFlow. So I don't see the value-add here. Maybe there's something I'm missing.
There's a clear complexity cost: the difficulty of extending these libraries. This separation forces libraries to become huge, complex monoliths. In Julia the equivalent is done with absolutely, hilariously tiny libraries. In fact they are so small that many Python folks exploring Julia decide not to explore further, thinking most of the Julia ML libraries aren't done or have barely started.
They are just not accustomed to seeing libraries that small. That is possible in Julia because it is all native Julia code, which means interfacing with other Julia code works seamlessly and allows you to mix and match many small libraries very easily. You can reuse much more functionality, which means individual libraries can be kept very small.
In PyTorch and TensorFlow, for example, activation functions have to be coded into each library separately. In Julia they can simply be reused by any library; each ML library doesn't need to reimplement them.
That is why you get these bloated monoliths: they have to reinvent the wheel over and over again. So yes, there is a cost which is constantly being paid.
Every time you need to extend these libraries with some functionality you are paying a much higher price than when you do the same with Julia.
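To make the reuse point concrete, here is a toy Python sketch (plain duck typing standing in for Julia's multiple dispatch; everything here is made up for illustration, not real framework code): one generic activation function, written once against ordinary arithmetic, serves any numeric type, so no per-library copy is needed.

```python
from fractions import Fraction

# One generic definition, written once against plain comparison and
# construction. In Julia, multiple dispatch lets independent libraries
# share a definition like this; Python's closest analogue is duck typing.
def relu(x):
    return x if x > 0 else type(x)(0)

# The same function works on floats, ints, and exact rationals alike,
# without each "framework" shipping its own reimplementation.
print(relu(3.5))              # 3.5
print(relu(-2))               # 0
print(relu(Fraction(-1, 3)))  # 0
```

In PyTorch/TensorFlow terms, the equivalent of `relu` lives inside each framework because their tensor types don't share a common generic layer; that is the monolith cost described above.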
Taichi already allows for amazing speedups and supports all the backends (including Metal): https://www.taichi-lang.org/. The fact that they didn't mention Taichi is a glaring omission.
This looks like a reboot of Numba with more resources devoted to it. Or a juiced up Shedskin. https://shedskin.github.io/
I think Mojo is a minor mistake, and I would caution against adoption: it is a superset fork of Python instead of being a performance subset. Subsets always decay back to the host language; supersets fork the base language. I would rather see Mojo integrated into Python than see it adopt the language and extend it.
I have a theory why Mojo exists. The team was reading a lot of PyTorch and Tensorflow, getting frustrated with what a PIA working with C++ and MLIR is for their model-backend retargeting codegen, so they created their perfect C++/Python mashup rather than use TVM. They nerd sniped themselves and rather than just use Python3.12+mypy, they made a whole new language based off of Python.
Very strange to have no GC as an innovative feature for a modern programming language. Personally I think Dlang got it right by making GC the default and providing no-GC as an optional feature.
As a comparison, the auto industry is moving toward fully automatic transmissions, especially for EVs, but the software industry is still undecided and seemingly cannot even come up with a robust GC mechanism that is on par with no-GC in terms of performance.
With no GC, interpreted programming languages (e.g. Python) will most probably keep being used well into the future alongside Mojo/C++/Rust, because the majority of AI/data science/machine learning programmers can't be bothered to touch the underlying code, for fear of the programming complexity of these no-GC languages.
> Personally I think Dlang got it right by making GC the default and providing no-GC as an optional feature.
I vehemently disagree. D's GC is the #1 reason for its fade into obscurity, instead of becoming a viable C++ competitor. Now it's completely overshadowed by Rust's success.
The truth is we can only speculate whether the main reason D is less popular now is its GC, or whether D would be more popular now if it had never had a GC. Most of the D standard library is written in no-GC style, so performance is not an issue there, and D with GC is not a sloth either. Based on programming language (PL) popularity evidence, for languages of similar age, for example Python (1991) vs C++ (1985), and Go (2009) vs Rust (2015), the PLs with GC are way more popular than those without [1].
Regarding Rust vs D popularity, time will tell. At the age D is now, circa-2000s Perl was notably more popular than Python, but then Perl lost its steam and faded into obscurity.
Regarding the sibling comment on the borrow checker, D can now support a borrow checker, and it's just one feature among its many capable features; Rust actually took it from Cyclone [3].
Traditional GC isn't less complex to program with than automatic reference counting. Traditional GC has its place in short-running extension languages, but in longer-running programs you run the same risk of memory leaks as with automatic reference counting, since you can still over-retain from a poor ownership model. What goes wrong is slightly different, but with automatic reference counting it is easier for the compiler to find and report these issues. I feel you are conflating this with having no automatic memory management at all, which would be a higher barrier. Automatic reference counting also greatly simplifies interop with low-level code and running on specialized hardware, compared to Python's interop story.
RC is GC, but without cleaning up cyclic references. In general, leaking memory is not a security problem (and with a good debugger it is trivial to fix), so I don’t think it is too important a property. Leaking every cyclic reference is a bit more of a problem, but it has solutions like the occasional tracing phase like with a “traditional”, mark’n’sweep GC, but at that point why not just go with them as they are much more performant?
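CPython itself is a live example of exactly this hybrid: it uses reference counting by default, plus an occasional tracing pass (the `gc` module) to reclaim the cycles that refcounting alone leaks. A minimal sketch, using only the standard library:

```python
import gc
import weakref

class Node:
    pass

gc.disable()              # turn off the tracing collector; only refcounting remains
a, b = Node(), Node()
a.other, b.other = b, a   # reference cycle: a -> b -> a
probe = weakref.ref(a)    # lets us observe whether `a` was freed

del a, b
# Refcounting alone cannot free the cycle: each node is still
# referenced by the other, so both refcounts stay above zero.
assert probe() is not None

gc.enable()
gc.collect()              # one tracing pass reclaims the cycle
assert probe() is None
```

This is the "occasional tracing phase" pattern: cheap refcounting for the common case, with a mark-and-sweep-style pass to mop up cycles.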
> Very strange to have no GC as a innovative feature for a modern programming language.
It's because, first, there was manual memory management, which was error-prone.
Then came garbage collection, which was safe but slow.
Most recently came borrowing and move semantics, which offers the best of both worlds -- safety and speed -- at the cost of some (arguably justified) cognitive overhead and code flexibility.
GC is first and foremost a code design tool. Not having a GC leads to less flexible code; every refactor will likely have to reimagine the memory ownership model in every relevant area.
So I absolutely don’t think of the borrow checker as superior to a GC, it’s a different tool with different tradeoffs. It was a good choice for a low-level language like Rust, but not necessarily for a high level language.
They claim they are writing the actual kernel code (as in the implementation of a matmul) with it, and it was presented as a "system programming language": this goes far beyond "high-level tasks" it seems.
I have wanted deterministic timing in Julia for years. It's come up periodically in the forums.
That, and optimized static binaries, would make Julia truly general purpose.
I don't mind Python syntax, I hope Mojo lives up to all these claims...it could easily (and finally!) hit the sweet spot of C performance, elegance, and expressiveness!
Yes. There are already efforts in that direction, such as Julia Robotics (juliarobotics.org).
With Julia, there is work to be done to get to small, optimized, static binaries - which it sounds like Mojo will provide out of the box, given it’s targeting resource-constrained (and novel) hardware.
The Rust-inspired features are also VERY interesting!
This is not true at all! Julia usually being JIT-compiled makes it very unsuitable for real time applications (and there's no reason why it should be great for it). GC is the least issue here, and I say that as a fan and daily user of Julia.
Julia is already being used in a number of systems with varying levels of real-time-ness. It's early days for sure, but there's a bunch of progress. It turns out lots of "real time" applications actually just need fast code. JIT compile also can be overcome since in real time systems, you generally know all the code you are going to call so you can just make a .so with all of those bits compiled and know that nothing will need to be compiled at runtime.
The coolest I know of is ASML putting Julia on their giant chip making machines https://www.youtube.com/watch?v=EafTuyy7apY. I think there's also work on putting Julia on satellites, but I don't have the details.
Note that in Julia 1.9 (rc3 currently) and thanks to `PrecompileTools.jl` being adopted by the vast majority of the packages, the compilation latency is a dead issue... (ok, it is moved to the package installation time...)
PrecompileTools.jl has definitely not been adopted by the "vast majority of packages" and compilation latency is not a dead issue. There's been huge progress, and the rate of progress on this is much higher than before, but let's not get ahead of ourselves.
That is not really true. People think that because they have spent time with Java which is excessively GC dependent. More modern GC languages such as Go and Julia have opted to use GCs in a far more conservative manner. They don't produce the exorbitant amount of garbage that Java produces. I've talked to NASA guys using Go for real time systems. They say it works great. GC doesn't need to be a problem if you do it right.
Yes, I'm not saying it's not possible, but real-time abilities were likely one of the least important aspects of Julia's design... so why shoehorn it into something it's not been designed for?
After the 1st JIT compilation (which here we treat as analogous to C++ static compilation), there is no compilation cost. As long as you avoid doing dynamic things (e.g. allocations that trigger GC), there's a great case to be made for real-time Julia.
Of course, but unlike in C++ you need to use ways and means to achieve this - never hitting uncompiled code - that are not very natural to Julia. Yes, it may be possible to compile Julia code to binaries, but it obviously is neither straightforward nor widely used.
You could do a lot worse than modern GC. In Rust, for instance, lack of GC may cause you to reallocate memory or use reference counting, both of which are likely slower than GC. For example, building a tree in Rust will do a lot of reference counting, which means a lot of heap accesses.
> building a tree in Rust will do a lot of reference counting
This isn't true in most cases. If every subtree is only referenced by its (unique) parent, then you can use a standard Rust "Box", which means that during compilation, the compiler inserts calls to malloc() (when the Box is created) and free() (when the Box goes out of scope). There will be no reference counting — or any other overhead — at runtime.
Are there any plans to make linear algebra, numeric operations, and so on first-class citizens, the same way Julia does? I believe another very compelling characteristic of Julia, for people working in physics, mathematics, artificial intelligence, and machine learning, is all the syntactic sugar Julia gives you. Translating a formula from a paper to numpy is not the same as translating it to Julia; in Julia, writing the expression from a paper is almost a one-to-one mapping. I believe this would be a huge point in favor of Mojo, since the UX when writing mathematical code is a huge advantage for Julia, the same way Python gained popularity as a general-purpose language because it is almost like writing plain English.
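For what it's worth, Python's data model already exposes the hooks such sugar builds on (`__matmul__` for `@`, `__add__` for `+`), which is what numpy uses today. A toy sketch with a made-up `Matrix` class (not numpy, just an illustration) shows how close to a paper's `y = Ax + b` the code can get:

```python
# Hypothetical toy Matrix class demonstrating Python's operator hooks;
# real code would use numpy, which implements the same protocol.
class Matrix:
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):   # invoked by the @ operator
        cols = list(zip(*other.rows))
        return Matrix([[sum(a * b for a, b in zip(r, c)) for c in cols]
                       for r in self.rows])

    def __add__(self, other):      # invoked by +, elementwise
        return Matrix([[a + b for a, b in zip(r1, r2)]
                       for r1, r2 in zip(self.rows, other.rows)])

A = Matrix([[1, 2], [3, 4]])
x = Matrix([[1], [1]])
b = Matrix([[10], [20]])
y = A @ x + b                      # reads like the formula y = Ax + b
print(y.rows)                      # [[13], [27]]
```

Julia goes further (Unicode identifiers, broadcasting dot-syntax, user-definable operators), which is presumably the gap Mojo would need to close for this audience.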
As a former Apple eng, I guess I forgot that some people don't know who Chris Lattner is, haha. There isn't anything unscrupulous going on here; he probably just thought posting under his username was enough of a disclaimer.
Congrats on the launch! In addition to the dramatic possibilities for ML, I hope mojo will have an impact in other scientific software through the ease of specialization to hardware capabilities. The current common subset of Python and “pure” mojo (ie not using CPython) is small, but will expand according to your roadmap. Do you envision porting pure Python libraries as part of the effort, or perhaps help other communities do so? For example einsum in numpy or pytorch are great, but how much more work would be needed to have a mojo-accelerated einsum interfacing with numpy or mojo-specialized arrays? Or have a pure mojo numpy altogether?
I think some of this consideration, and the resulting work, is already being looked at in the context of Numba. They came up with PIXIE (Portable Instructions eXchanged in Executable), the concept being to compile your functions into a shared library which embeds the low-level compiler IR and can then be JIT-compiled, or "pulled in" at link time.
Modular can do exactly the same with existing Python packages, and lift them into Mojo that way.
I really like the integration of lower-level memory control in a superset of Python. Trying to maintain compatibility with such a large and varied ecosystem is a daunting task, and has the opportunity to be very valuable for many people who don't want (or are unable) to move away from the Python ecosystem. Kudos to you all for tackling such a difficult problem! Good luck, I look forward to seeing how you guys help to increase efficiency across the board!
I notice that Mojo still seems to use numpy or something that looks "numpyish" for compatibility. Will Mojo also have an alternative syntax for doing things like matrix multiplication that looks more native like Julia's?
Just wanted to share how much of a fan of your work I am. I wish you could have stayed in the Swift ecosystem; I feel like the language has lost its track since you left.
Looking forward to seeing where this new adventure will lead, and congratulations!
A lot of companies build web APIs around their ML/AI code and data pipelines in Python. Is Mojo suited for these tasks as well, or is it specialized for numerical/AI/ML tasks?
- Compile time language is ~ the runtime one. That's the model most systems languages seem to have ended up at. It doesn't have macros or reflection, but neither of those seem to be very popular
- The struct/let notation and lifetime model essentially give you C++ semantics with saner syntax. The class/def notation essentially gives you Python semantics. This is a clean answer to the gradual-typing-for-performance problem
- You can drop into MLIR at will and the std types are implemented like that. This makes the language look a lot like syntactic sugar over writing the IR which closely matches how (at least some) compiler devs think about programming languages
Yeah, I think that'll work. Python/C++ mashup is a popular dev stack and this can make that much cleaner. Faster than C is rather unproven but given the difference in compile time control should be achievable.
If this succeeds, it will allow you to use Python for the entire AI stack: high-level model composition (as usual), fast compiled CPU code (instead of, say, libs written with C++), and on-device operations (instead of, say, libs that use CUDA). Oh, and it will make your Python code parallel (i.e., there's no GIL).
Obviously, we'll have to wait until Mojo is production-ready, but I'm excited after seeing Jeremy Howard's easy-to-follow examples during the live keynote presentation. Jeremy, who sometimes hangs out on HN, must have been dying to tell the world about this for a long while.
>>If this succeeds, it will allow you to use Python for the entire AI stack [...]
If this succeeds, the terminal endpoint will be the Python Software Foundation adopting Modular's implementation as the de facto, and eventually official, implementation since, as Modular noted in their docs, they effectively need Mojo to be absolutely amazing on generalized host CPUs as the key enabler for the unified Python-superset experience across other types of general and specialized hardware ("xPU").
Julia will be dealt an adoption setback proportional to Mojo's growing success.
No it won't. The language is not a superset of Python; it's another language that somewhat resembles Python. You can't drop in some Python code, run it with Mojo, and expect it to work.
The goal is for the language to be a superset of Python
> Further, we decided that the right long-term goal for Mojo is to provide a superset of Python (i.e. be compatible with existing programs) and to embrace the CPython immediately for long-tail ecosystem enablement.
> 1. We utilize CPython to run all existing Python3 code “out of the box” without modification and use its runtime, unmodified, for full compatibility with the entire ecosystem.
> 2. We will provide a mechanical migrator that provides very good compatibility for people who want to move Python code to Mojo.
> Together, this allows Mojo to integrate well in a mostly-CPython world, but allows Mojo programmers to be able to progressively move code (a module or file at a time) to Mojo. This approach was used and proved by the Objective-C to Swift migration that Apple performed.
Maybe the long-term goal is to try to make it a true superset, but it sounds like the detailed plan is more practical, to basically provide a "Python-next" language, like Swift for Objective-C.
This isn’t Python though, right? It’s just a Python-like syntax for a much lower level programming language (sort of like Cython) from my cursory glance? It seems a lot like Rust with a Pythonic veneer, so I would expect it will run into many of the same problems that Rust has (lots of difficulty pacifying the borrow checker)?
The source code is Python (well, a superset of Python), but it's not interpreted and run by the official Python interpreter. Judging by what I've seen so far, it seems Modular has been able to "Rustify+Tritonify" Python code in a way that to me feels... very Pythonic.
Maybe someone can correct me, but I don't think it's a superset of Python--I don't think it will accept existing Python programs and I'm pretty sure you can't do all of the dynamic Python stuff (otherwise you would have to essentially compile/link an interpreter into your binaries giving you these super slow paths that aren't necessarily obvious when reading source code).
> Judging by what I've seen so far, it seems Modular has been able to "Rustify+Tritonify" Python code in a way that to me feels... very Pythonic.
I think it "feels Pythonic" because you're looking at Pythonic syntax. I suspect when you're banging your head against a borrow checker it will feel more like writing Rust. Like I'm pretty sure anything with a borrow checker will have to deal with lifetimes, mutable-vs-immutable references, shared mutability (cell/refcell in rust), etc; I don't know how you "Pythonify" that in any meaningful way?
Exciting stuff indeed, looks like a unified approach to recent developments like (OpenAI) Triton, TVM, OpenXLA, ONNX Runtime, etc. though we'll have to wait for proper benchmarks and compatibility to know how great it is. Also it's a bit of a bummer that the compiler is behind a waitlist for now
Hi Jeremy Howard here. I pop up in the launch video to demo super-fast matmul and mandelbrot in mojo. I'm pretty excited about this language, to say the least - not just for AI/ML, but for pretty much everything!
Lemme know if you have any questions and I'll answer as best as I can. (I'm an advisor to Modular.)
Are there plans to port any major existing Python libraries to pure mojo (mojo that will not use CPython)? Is there a plan for a general purpose matrix or tensor library like numpy or pytorch, but in pure mojo?
I'm not sure if it officially qualifies as a "plan", but a pure mojo numpy has certainly been discussed... personally I'd guess that's something that will need to be built to have this all work as well as possible.
Although there are no concrete implementations to review, the implication was that Mojo will/can be used standalone (similar to a Python implementation), so one is not necessarily locked onto the Modular platform. Was this a correct understanding?
Also, will the next version of the FastAI library be written in Mojo?
Mojo can actually create fully standalone binaries, so you don't even need a Mojo install to run them! Also the binaries are really small compared to languages that need a big runtime (e.g. a binary containing the matmul implementation is ~100kb).
I'm not sure how long it will take before there's enough ML/DL functionality to write something like fastai in Mojo. I think there will be at least one more major version of fastai in Python. And even when Mojo can support what fastai needs, I expect to continue supporting fastai on Python as well.
So, should the absence of an answer to the question of a standalone compiler be interpreted as that this will not happen?
Reason for asking: without a standalone, open-source compiler chain, Mojo will be out of the question for one of the most exciting application areas I can think of: computational biology.
How does that work given that the binary needs to lug around a CPython binary (unless you want to depend on the OS python)? Also, how does this deal with python libraries?
but won't that include most applications for the foreseeable future? I assume no one is going to be rewriting the entire data processing/web stack from python to mojo in the near future, right?
Separate question: what is the eventual monetization strategy? Of course, feel free to not answer, though if that plan is not a secret and makes sense to people, it might help spread the technology.
Heh I don't even know that myself - I'm just focussed on playing with Mojo. But note that Modular also has an inference engine which looks pretty amazing and seems likely to bring in big $$$ AFAICT.
Hi is there any plan to have high level model training tools (one example is things like Pytorch dataset loaders), or is the focus more on inference/deployment use cases?
Hey, so this is kinda tangential, but I thought only Google can target TPUs, at least without going through an intermediate layer like OpenXLA. How are you handling that?
I'm not sure of the details, but I believe that XLA is starting to use MLIR now, which is what Mojo uses behind the scenes. I'm not sure if Mojo will use MLIR directly or go via XLA.
It looks like the Mojo compiler/runtime hooks into the CPython interpreter.
> We utilize CPython to run all existing Python3 code “out of the box” without modification and use its runtime, unmodified, for full compatibility with the entire ecosystem. Running code this way will get no benefit from Mojo, but the sheer existence and availability of this ecosystem will rapidly accelerate the bring-up of Mojo, and leverage the fact that Python is really great for high level programming already.
Is this closed source? I just assumed that a new programming language would be open source but I don't see any links to GitHub or any other place. The getting started page also requires you to put in your name and contact information.
Oh god, I hope this is not a closed-source language. I have spent 10+ years fighting the MATLAB ecosystem; I'd prefer not to spend the next 10 years fighting this thing.
Matlab existed for a reason, and has since become irrelevant, in particular in the face of python being used as a more flexible open source scientific computing scripting language.
If this is closed source, it's already as irrelevant as matlab so no reason to fight it. If there are useful bits there will be python versions of them.
> Matlab existed for a reason, and has since become irrelevant
My man I wish this was true. I am not exaggerating the 10+ year thing, there are very important industries being run today on MATLAB. You remember the moment when you learned that a lot of wall street runs on excel? This is the moment you learn a lot of silicon manufacturing runs on MATLAB. An obscene amount.
Silicon manufacturing, aerospace, automotive, robotics, the list goes on. It’s a well-entrenched ecosystem that makes a decent chunk of cash and has a lot of staying power. Simulink seems like a major factor in this staying power.
Strongly disagree with this statement (and agree with the parent). In many hard engineering disciplines, appropriate solvers (e.g. SuiteSparse) are barely supported, if at all (e.g. scikit-sparse only supports CHOLMOD, and UMFPACK is supported via another package).
People overestimate how many people are willing to work on the "wrap C numerical library in Python" problem. On the other hand, Mathworks employs many people to work on things like mldivide. At least in the SuiteSparse case, the first class citizen is MATLAB.
Sounds like a niche to me, with lots of options (wrapping C numerical libraries in python) for people that don't want to be locked in.
When I was in school 15 years ago, matlab was pretty ubiquitous.
Maybe it was too strong to say it's irrelevant as opposed to niche, though it's definitely irrelevant in many fields where it used to be king. I do miss the figures though; I liked the combination of programmatic formatting + manual tweaks.
What exactly do you call a niche? Entire industries are being propped up using it. What happens is that because the solvers are written in MATLAB, engineers start building everything else with it too. I have personally made networking implementations, webservers, an Apache Arrow clone, etc. It's really not as niche as you think it is. For mostly worse, this language is everywhere.
Matlab is a lot faster than Python. At least an order of magnitude, once you move out of calls to LAPACK et al. It has been improved quite a bit on that front.
So Python is often not a realistic alternative. This is where initiatives like Mojo come in. They will be at least an order of magnitude faster than Matlab again.
But if you want to get people to move from Matlab to a powerful open-source alternative: Julia has a syntax that is much closer to Matlab than Python's. I had good success getting colleagues to use Julia who wouldn't look at Python, because the syntax was too far out of their comfort zone.
Yes, we expect that Mojo will be open-sourced. However, Mojo is still young, so we will continue to incubate it within Modular until more of its internal architecture is fleshed out. We don’t have an established plan yet.
Why not develop Mojo in the open from the beginning?
Mojo is a big project and has several architectural differences from previous languages. We believe a tight-knit group of engineers with a common vision can move faster than a community effort. This development approach is also well-established from other projects that are now open source (such as LLVM, Clang, Swift, MLIR, etc.)
The problem with this is that you don't know what the licensing will be when/if they open source it. So you could start using it now and they might change their mind and choose not to open source it or it might be a restrictive license.
Also, even if it's fully open source, you have to look at governance too.
(A canonical project can be steered in a way that makes it impractical for you, and forking is often also impractical.)
For now, I'd treat it as closed source, which is a non-starter for investing in, when I can accomplish the same in open source ways. And there's no sense in giving away the open source uptake benefits to a company when the software isn't open source.
Their response about why it's faster to move while closed source seems like a false dichotomy. They could have developed it in the open while still disallowing input from the community, or simply closed their issue tracker.
One that pays expensive senior full time SWEs to build it does, yes. Every programming language uses one of two models:
1. Patronage
2. Business
Patronage is the Ruby/Python/Linux "one guy + volunteers" model where a few of the devs look for companies to pay them to do it as a full or side project basically for marketing purposes or because that company can afford to subsidize their tools. This clearly isn't that.
A VC backed startup making an open source language is neither patronage nor a business, unless you take the hard-cynic position of saying the investors have been tricked into being patrons.
I'm assuming they've got some sort of cloud related ideas for monetization, but it's tough.
One of the biggest problems with Python is packaging, and one of the biggest problems with Python packaging is installing scientific computing modules that rely on C/C++/Fortran. Will Mojo attempt to address these problems?
Yes this must be fixed. Python packaging is such a nightmare I use a fresh docker image for every project, which is much easier than messing with pyenvs, conda, broken wheels and that whole mess.
Are you afraid that you're going to "inherit" the issues the python ecosystem has through your goal of full compatibility? Or is Mojo more like Numba, in that only parts of python will actually be supported for full acceleration? At least your docs[1] seem to say so..
Agreed. I think this effort is completely missing the real pain points that ML suffers from.
While python the language is easy, and in many ways great for its original purpose as a teaching language, I'll note a few of the ways that Python ML suffers:
- pip hell. Really, having globally installed dependencies was great for the 90s and is terrible now that disk space is more or less a non-issue relative to dependencies. Venv/conda, which do sneaky things e.g. with your shell, are super dangerous (https://twitter.com/garybernhardt/status/1653171980483575808), and a misstep can trash your system, especially when it has to deal with wheels with system-level dependencies (looking at you, tensorflow -- probably half of the reason why people moved to pytorch). Poetry sounds nice. It's been a while since I've checked in with the python ecosystem. Are ML people using that yet?
- Subpar deployment. Let's remember that Containerization basically exists because Python does not have an ops story.
- Subpar integration with web. You are forced to either create a microservice, or, spin it up within Django (nobody really does this). Then you typically have to pull in a bunch of sidecar processes (Redis, Celery, etc.) just to get queuing of your web jobs correct.
- Poor concurrency. Sure, you can run your tensorflow code in an awkward 'with' statement, but I think there are very few ML practitioners who could really explain to you what that with is doing. The GPU is actually fundamentally an asynchronous entity. And god help you if you want to run and debug async python.
- No distribution story. Sure, the big guys are able to spin up, e.g. Horovod, but it's not really a thing for someone with less resources for a hot second on a few machines, and again, god help you if something goes wrong and you need to debug it.
Does Mojo solve any of these issues? From a cursory look, it looks like no.
I guess their argument is that the reason those things suck is because they are hooking in some C++ monstrosity (TensorFlow) or making RPCs to an external daemon (Redis, Horovod).
So rather than writing another Python wrapper over c++ they are making a new performant language that can call Python.
To me it makes sense as torch is great and hard to compete with, but everything feeding into it is a mess today (Data loading, distribution logic).
don't forget the control layer/data layer separation principle. Performance mostly only matters at the data layer, and I don't believe that python ML really has a substantial problem with this, aside from not having a real distribution story. So "having a more performant python" doesn't really solve that much.
I'll tell you what could make the control layer better.
- no gil
- better async primitives
- immutability of passed parameters
- better testing story
- better documentation story (python is quite good at documentation, well, when python devs actually do it, which they usually don't).
Could you explain or give references to what you exactly mean by this? I've heard of separation of concerns, but is this a specific realization of that principle?
It might make the issues worse, actually, if you end up with a language that promises full support for python as a subset, but indefinitely doesn't actually do that, and is nevertheless encouraging people to import python libraries. Then it's like a wrapper around a wrapper around a dependency tangle...
This post has me very interested in Mojo, and it has a lot of potential, but it's difficult to get too excited because so much of it doesn't actually exist at the moment. Nothing is open sourced yet, and from the docs it seems they don't even have classes implemented.
My experience with new languages is that the devil is often in the details and the stuff that gets put off is sometimes where the sticking points are, where performance starts to decline relative to other languages, and where you start to run into dependency hell. It's not so much I want mojo to fail or anything — the contrary in fact — but it's so hard to know where it will end up this early in its development.
Congratulations on the launch preview! This is exciting, as I've been waiting for this announcement since Jim Keller's callout on Lex Fridman podcast two years ago[0].
I'm curious about the teasing around open source. Obviously the amount of money and the slick product launch dictate a need to capitalize on this pooled expertise. Don't want to give away the game to the hyperscalers.
I wonder what the revenue and license models are going to end up looking like. A cloud of their own, professional services to HW manufacturers to optimize their performance, professional services to hyperscalers?
Curious if anybody has any ideas beyond the obvious.
Initial impression is that the animation of Mojo code vs. Python code has a bad UX. Why not just show the code side-by-side instead of animating it and making me click?
Another obvious question is how is it different than Numba and so forth?
The Mojo language has lofty goals - we want full compatibility with the Python ecosystem, we would like predictable low-level performance and low-level control, and we need the ability to deploy subsets of code to accelerators. We also don’t want ecosystem fragmentation - we hope that people find our work to be useful over time, and don’t want something like the Python 2 => Python 3 migration to happen again. These are no small goals!
and
Mojo already supports many core features of Python including async/await, error handling, variadics, etc, but… it is still very early and missing many features - so today it isn’t very compatible. Mojo doesn’t even support classes yet!
So I think the idea is good, but yeah, re-implementing Python is a huge effort!
Though the comparison right below is notable:
A major goal of Clang was to be a “compatible replacement” for GCC, MSVC and other existing compilers. It is hard to make a direct comparison, but the complexity of the Clang problem appears to be an order of magnitude bigger than implementing a compatible replacement for Python. The journey there gives good confidence we can do this right for the Python community
> A major goal of Clang was to be a “compatible replacement” for GCC, MSVC and other existing compilers. It is hard to make a direct comparison, but the complexity of the Clang problem appears to be an order of magnitude bigger than implementing a compatible replacement for Python. The journey there gives good confidence we can do this right for the Python community
Is it though? For Mojo to be a compatible replacement for Python, it would need to match the Python C ABI and the greater set of the standard library in a bug-for-bug compatible way.
The real question is whether it's bug-for-bug compatible or not. Numerical functions have a lot of nuances that affect performance quite a bit. Are they constrained so that Python numpy log(x) gives the same result as Mojo log(x)? Will C bindings act the same way as in CPython? There's a whole list of related questions. If the answer to any of these is no, then code acts subtly differently in a way that is sometimes hard to detect. These differences are of course what have held back "standard code" from being numba/pypy/etc. compatible in many instances.
That said, if the answer to all of these is yes, then it is very difficult to make optimizations. Not allowing hard-to-optimize Python behavior is precisely what has allowed Numba, Julia, etc. to achieve accelerations. There were some attempts at that kind of thing with R, which Jan Vitek gives some very interesting talks about (https://www.youtube.com/watch?v=VdD0nHbcyk4).
What I would find worrisome too is that this approach sounds like compile-time city. A lot of the recent advancements in Julia have come from sending less to LLVM: optimizations are done in Julia and dead code is eliminated, calls are found to be the same and check caches, and then with v1.9 those caches use precompiled binaries to avoid having to call LLVM again. The timeline of improvements shows that keeping LLVM out of the picture as much as possible has led to some dramatic improvements in Julia's latency (https://viralinstruction.com/posts/latency/). Given that, I'm wary of an approach that does everything in LLVM (via MLIR) on an even more dynamic representation (i.e. Python). My guess is that only things with explicit types will compile, and the rest probably hits some Python interpreter to avoid this issue.
[If part is using Python/GC through an interpreter though, wouldn't that part be harder to target to accelerators since it wouldn't compile through LLVM? That would mean only the code that is explicitly typed gets the nice new features, but not the Python parts?]
But hey, if this gets Chris Lattner and crew a reason to start taking LLVM compile times more seriously, then it's a win for Julia as well. I'm excited to see how the communities can benefit from one another, especially if Mojo is open source then it can be a win for all (which was definitely not clear in the presentation). And I do think that some of the ideas of lower level memory control are cool and should be added similarly to languages like Julia and Numba.
Thanks for your interest, I'm pretty confident we can do this. I've been working on compilers and languages for a while. :)
Your point about LLVM compile time is a great one. Mojo is architected from the beginning for fast compile times, including deeply integrated caching and distributed compilation. LLVM "isn't slow" if you don't keep asking it to do the same thing over and over again.
> Is it though? For Mojo to be a compatible replacement for Python, it would need to match the Python C ABI and the greater set of the standard library in a bug-for-bug compatible way.
Isn't this going backwards? If Mojo delivers what it is trying to do, then these libraries would be rewritten in Mojo, and the issue of having to use two languages and continuously switching between them will be avoided.
I saw your talk in London a few weeks ago and immediately thought of your discussions on python accelerators when I heard about Mojo - glad to see your thoughts here!
The fire emoji as file extension makes me feel very uncomfortable. Am I old now? (I know I don't have to use it, but I usually work in big projects with lots of other people's code).
I'm not personally a fan of emoji, but I do like using and typing more than plain ASCII in the terminal. (One thing I liked about Julia is their partial embrace of Unicode, and one thing I don't is that you can't write lambdas with ‘↦’.)
This is the final nail in the coffin for "Julia as a replacement for Python" in my eyes. Maybe I'm very late to that conclusion, maybe the writing's been on the wall for a while, but I still held out hope that Julia could be at least a parallel peer to Python sharing the market.
But now, it seems almost certain that Julia will end up a possible replacement-for-Matlab niche language, unknown and unused by most outside the niche. It's a pretty big and important niche, to be sure, but a bit disappointing given how nice the language is.
Predicting language ecosystem development is difficult, especially if it's about the future. It does feel that Julia may have missed its window of opportunity. But maybe not.
On the one hand, Python's mindshare has been growing exponentially, riding on successive waves of data science, machine learning, deep learning, AI and now AGI hypes. NB: The two hypes it did not benefit from (for obvious reasons) are big data and crypto/blockchain. Given Python's heavy historical baggage, though, you might think that eventually gravity would reassert itself - with a potential crash landing.
So in a sense, if the Mojo project succeeds in becoming a very broad-based renewal effort (a Python 4 thing) that fixes some of Python's limitations, it will lock in Python's current amazing popularity. If not, then the field is still open for a challenger, and Julia could well be that.
The broader technology space feels very febrile right now. Lots of talk, much less walk. The winners will be simply those who deliver tangible "next-gen" experiences to developers and end-users.
I would like to know this as well. Not just “can you write general purpose programs” but is that one of the fundamental design goals. For example, in my personal opinion Julia does not meet the bar.
One of Python’s powerful features outside of being a great glue language for AI is rich runtime reflection and introspection. This is what allows things like FastAPI/Pydantic to work. Will Mojo support this level of runtime introspection?
I’m also curious about the type system. Will it be Python level or TypeScript level?
It is a little disappointing that they're setting the bar against vanilla Python in their comparisons. While I'm sure they have put massive engineering effort into their ML compiler, the demos they showed of matmul are not that impressive in an absolute sense; with the analogous Julia code, making use of [LoopVectorization.jl](https://github.com/JuliaSIMD/LoopVectorization.jl) to automatically choose good defaults for vectorization, etc.:
julia> using LoopVectorization, BenchmarkTools, Test

function AmulB!(C, A, B)
    @turbo for n = indices((C,B),2), m = indices((C,A),1)
        Cmn = zero(eltype(C))
        for k = indices((A,B),(2,1))
            Cmn += A[m,k]*B[k,n]
        end
        C[m,n] = Cmn
    end
end

M = K = N = 144; A = rand(Float32, M,K); B = rand(Float32, K,N); C0 = A*B; C1 = similar(C0);
AmulB!(C1, A, B)
@test C1 ≈ C0
2e-9*M*K*N/@belapsed(AmulB!($C1,$A,$B))
96.12825754527164
I'm able to achieve 96 GFLOPs on a single core (Apple M1) or 103 GFLOPs on a single core (AMD EPYC 7502). And that's not even as good as what you can achieve using e.g. TVM to do the scheduling exploration that Mojo purports to do.
Perhaps they have more extensive examples coming that showcase the capabilities further. I understand it's difficult to show all strengths of the entire system in a short demonstration video. :)
EDIT: As expected, there are significantly better benchmarks shown at https://www.modular.com/blog/the-worlds-fastest-unified-matr... so perhaps this whole discussion truly is just a matter of the demo not showcasing the true power of the system. Hopefully achieving those high performance numbers for sgemm is doable without too much ugly code.
Yeah I think no one will likely have any edge for a simple thing like a matrix multiplication since all the right abstractions are supported in both languages and they end up in the LLVM code gen. Having Python 3 backwards compatibility and easily deploying your code to, say, phones via a C++ API is quite big though.
It seems you are using N=144 whereas in the Modular example they are doing N=1024, which makes this Julia example a significantly less computationally expensive calculation.
I'm sure there are reasons for it, but Chris Lattner has been jumping around a bit. Remember swift4TF.
But hopefully this one is seen through with lots of open source too.
but Mojo seems to keep all of that numpy cruft that's there because of Python (numeric computing was bolted onto Python via numpy, but it's built into Julia)
This looks exciting, and I'm looking forward to its progress. I have questions about GPU infrastructure in particular. It seems like the GPU backend is primarily through MLIR. Does that target CUDA? Vulkan compute shaders? WebGPU? Metal shaders? I note that IREE has the first three of those listed as targets, but am not clear on exactly where that fits in to the rest of the ecosystem.
I have an idea of a dream GPU infrastructure based on compute shaders, where you compile your problem into GPU IR suitable for your operating system (SPIR-V, Metal, DXIL) and have a very lightweight runtime that just runs those. For the ahead of time compilation case, you wouldn't need to ship GPU compilers in a deployed application, though you would want that for more rapid iteration when doing research and exploratory programming.
From what I can see so far, this seems basically orthogonal to what Modular and Mojo are trying to do, but it's entirely possible I'm missing something. I'm wondering if anyone is actually building this (IREE and MediaPipe are the closest things in the space I'm aware of), and if not, why not.
You can lift to various dialects in MLIR, for example, there is a Metal dialect, but it's not actively maintained. So, while it's possible, the question is whether Modular will add support for it.
Is there a sample or tutorial that jumps right into typical examples of what's different or special about it? I tried to find such, but the docs get lost in syntax details and other minutia. I want the ADHD-meat-and-potatoes tutorial. Maybe that's asking too much, but I did it anyhow, as any ADHDer would.
Impressive. I think this has a real chance of being successful. Backwards compatibility with Python and what appears to be a strong focus on improving what makes Python frustrating to work with (let/var, no GIL, compiled) all looking promising. I'll definitely be keeping a close eye on this and look forward to getting access to the playground.
Swift already imported Python code in a similar fashion. Feels like this is a more Pythonic syntax for Swift and likely carries over all the underlying goodness.
[FWIW, folks waiting for this should also look at Cython, which is different, but uses Pythonic syntax for more of a C-like semantics.]
Swift doesn't do a good job of interleaving with Python (i.e. allowing Python to call into Swift, while possible, is nothing like Numba, due to the various object-type translations required).
Mojo seems to be targeting Mojo -> Python -> Mojo too (i.e. Mojo can do high level control flow, delegate some more control flow / unsupported ops to Python, then implement some accelerator supports that Python will call back to). This can be quite difficult if you want to have very low bridging cost (Python objects are quite large and different libraries, such as Python / numpy have different representations in C on top of these Python objects).
It is all possible (after all, we are doing computer stuff), but it is quite a difficult path compared to other successful interop stories (Swift / ObjC took a decade to achieve somewhat low bridging cost; Kotlin / Java simply gives up and does everything in the JVM).
Nvidia has a ton of lock-in with CUDA. Seems like it could shake up the GPU industry if this takes hold as a standard tool in ML. Also, I'd imagine AMD, Intel, Apple, and others will be lining up at the door to sponsor this project.
Can this produce a static AOT-compiled binary? This currently seems to be one of the biggest limitations of Julia (and even Python) in terms of distribution, especially in the embedded space. PackageCompiler.jl helps, but binaries are still huge.
Julia's binaries are currently 50 MB to 1 GB depending on what you put in them. The large floor comes from the fact that you have to ship LLVM, which is around 50 MB by itself. (The 1 GB binaries are if you include a couple thousand packages in the binary too.)
A magnificent, game-changing project once again by the creator(s) of LLVM. I think this time they might have a proper Python alternative. Many have tried (Julia, R, C#, Swift, Rust, Haskell, etc.) and all have failed to compete with Python.
Perhaps this time, we finally have one that is a proper replacement for anything requiring intensive compute and performance without being a systems programmer, all thanks to the finest compiler engineers who brought you LLVM, MLIR and now Mojo. (Not the AI hype squad sitting on O̶p̶e̶n̶AI.com's APIs.)
If you know Python you have learned 90% of Mojo. Just waiting to see if it can compile to a binary.
If that works with any libraries built on the CPython C API it would be worth calling it out in the docs - it currently reads like pure python or numpy work, but I assumed that was because you'd reimplemented numpy, not managed to dlopen it (and worked around those libraries relying on the GIL).
> We utilize CPython to run all existing Python3 code “out of the box” without modification and use its runtime, unmodified, for full compatibility with the entire ecosystem
@chrislattner out of curiosity, can you share on which CPU the numbers in https://docs.modular.com/mojo/notebooks/Matmul.html are obtained and/or the fraction of peak performance on that machine? Speedups over naive Python feel kind of like a strawman. I can see a 3500x improvement over that with practically vanilla MLIR, a similar schedule, and no autotuning.
It's a superset of Python. What you call "Python" is actually CPython, a C implementation of the Python language. Many other Python implementations exist, though they typically don't support all packages well (for example lack of support of native Python extensions like what Numpy relies on). It looks like Mojo has its own runtime (so not CPython) but also packages CPython for compatibility?
I've only skimmed through the docs so far, and it sounds interesting, but..
> The Mojo standard library, compiler, and runtime are not available for local development yet, so we created a hosted development environment where you can try it out. We call it the Mojo Playground!
Is this going to be a SaaS-style programming language, or will it be open source and local later? I'm not sure I understand why a new programming language wouldn't start off as open source, or at least able to be used locally?
Wow. Could someone do this for Typescript/JavaScript?
Like the language is full interop with the existing JS ecosystem, but if I opt into certain restrictions (and run on some novel runtime), I get an insta-performance boost?
Personally I wonder if the Typescript team themselves could drive this sort of evolution, albeit I know that lately they're very committed to being "just types" and nothing that affects runtime. Which I get.
Very cool development. There is too much busy work going from development to test to production. This will help to unify everything. OpenAI Triton https://github.com/openai/triton/ is going for a similar goal. But this is a more fundamental approach.
Fellow Julia user. Quite excited about Mojo. Been following Chris since his MLIR and Swift for TensorFlow days, and his fast.ai course with Jeremy. I hope we can also derive some exciting learning from Mojo :)
Sounds really exciting. If you can carry out your ambitions, and avoid all the pitfalls of licensing and community building, this might be the next big thing in the field of data science.
Do you have a target release date? Maybe in 2024?
> Is it because of the language design, or because of its implementation?
Bit of both? The language expects properties that lend themselves to having a GIL (i.e. attempts at removing it from CPython have turned out to make it slower in many cases), but it's not impossible for an implementation that does more advanced analysis to be able to figure out cases where it isn't needed.
Code written to massively parallelize will want to/have to keep accesses inside the thread context anyways, and thus won't hit cases the GIL serves, and if you allow language extensions they can make that explicit where needed.
That video had really high production quality making it seem like Modular.com is very well funded. What is their source of funding, what’s going to be their business model?
From what I can tell:
1. They have a REPL and Jupyter integration
2. They're planning to do C and C++ interop. This is achievable in Nim but it is a pain, i.e. there is no easy way to import/include a C header in Nim, e.g. nothing as straightforward as @cImport("cheader.h") in Zig or #include "cheader.h" in C++; one must go through c2nim or a third-party library
(1) cannot be achieved without a custom linker or using a JIT such as LLVM or compiling to WASM and embedding a WASM runtime into the binary
I really hope Nim gets some language/compiler upgrades due to this.
What's the question? They are sharp and pointy, and Mojo has an unsafe `Pointer` struct vended by the standard library for folks who know what they are doing.