Nobody is writing NNs in Python; they are just describing them.
For NNs or DL in general, correctness doesn't hinge much on code-quality concerns like the ownership rules Rust people love to talk about. It is more about numeric stability, underflow/overflow, and the like. The choice of programming language offers limited help there.
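To make that concrete, here's a tiny numpy sketch of the kind of issue I mean; no amount of compile-time checking catches it, because every line is "correct" code:

    import numpy as np

    x = np.array([1000.0, 1001.0, 1002.0])   # large logits

    naive = np.exp(x) / np.exp(x).sum()                       # exp overflows to inf -> nan
    stable = np.exp(x - x.max()) / np.exp(x - x.max()).sum()  # standard max-shift trick

    print(naive)    # [nan nan nan]
    print(stable)   # [0.09003057 0.24472847 0.66524096]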
I don't think Rust has a killer app to offer the ML/DL community as of now; the focus is vastly different.
I've had a few Rust lovers come and mention this project to me recently. None of them had any data science or ML experience. None of them knew that Python is just used to define the high level architecture.
At the same time, comparatively tedious languages like Rust will never attract data science practitioners. They don't care about the kind of safety it brings, and they don't care about improving performance in a component that's idle 99% of the time.
The bulk of the load in a DL workflow is CUDA code and sits on the GPU. Even intermediate libraries like cuBLAS would see marginal-to-no benefit from being reimplemented in Rust.
This is a cool project, but it has no chance to displace or even complement Python in the data science space.
I think the industry is moving toward an 'MLIR'-style solution (yes, there is a Google project with exactly that name, but I am referring to the general idea here), where the network is defined and trained in one place, and then the weights are exported and delegated to an optimized runtime for execution.
If that trend continues, there will be very little reason to replace Python as the glue layer. Instead the flow becomes: train everything in Python -> export to a shared format -> execute in an optimized runtime.
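For example, here's a minimal sketch of that flow with PyTorch on the training side and ONNX Runtime as the optimized runtime (just one common pairing; the file name is arbitrary):

    import torch
    import torch.nn as nn
    import onnxruntime as ort

    # Define (and train) the network in Python; training loop omitted.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

    # Export to a shared format.
    dummy = torch.randn(1, 4)
    torch.onnx.export(model, dummy, "model.onnx")

    # Execute in an optimized runtime; this last step could just as well
    # happen from C++, Rust, a mobile runtime, or dedicated hardware that
    # consumes the same file.
    session = ort.InferenceSession("model.onnx")
    result = session.run(None, {session.get_inputs()[0].name: dummy.numpy()})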
Rust's opportunity could be to replace C++ in that runtime layer. But keep in mind that this is also a competitive space, where computation is being pushed further down into hardware implementations like the TPUv1 and T4 chips.
But AOT/JIT-compiled languages that can talk to the GPGPU natively, without second-language syndrome (Julia, Swift, Java, .NET), will certainly be more attractive to data science practitioners.
I can already envision those life science folks who migrate to VB.NET once they outgrow their Excel/VBA code starting to play with ML.NET.
Python already gets JIT-compiled to CUDA[1], and there's an entire funded ecosystem built around Python+GPGPU called RAPIDS[2], which is the future of the ML space by most indicators.
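Numba, for example, compiles annotated Python functions straight into CUDA kernels; a rough sketch (assumes a CUDA-capable GPU and the numba package installed):

    import numpy as np
    from numba import cuda

    @cuda.jit                       # compiles this Python function to a CUDA kernel
    def add(x, y, out):
        i = cuda.grid(1)            # global thread index
        if i < x.size:              # guard threads past the end of the array
            out[i] = x[i] + y[i]

    n = 1_000_000
    x = np.arange(n, dtype=np.float32)
    y = 2 * x
    out = np.empty_like(x)

    threads = 256
    blocks = (n + threads - 1) // threads
    add[blocks, threads](x, y, out)  # Numba handles the host/device copies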
I don't see any other language even making a dent in the Python ecosystem without some kind of new killer feature that can't be quickly replicated in Python.
> comparatively tedious languages like Rust will never attract data science practitioners.
Well, fast.ai is using Swift now.
... I think it's fair to say 'never say never'.
You're probably right, Rust isn't really the sweet spot for this stuff, but it's also the case that Python has some downsides that are pretty severe and well acknowledged.
The parent comment literally said data science practitioners don't care about speed or safety because the GPU is where all the real work happens. That's false, and I've provided an example of it being false from a respected party. What do you want me to say?
eh, I give up. Believe whatever you want to believe.
> Because Swift for TensorFlow is the first serious effort I’ve seen to incorporate differentiable programming deep in to the heart of a widely used language that is designed from the ground up for performance.
> But Python is not designed to be fast, and it is not designed to be safe. Instead, it is designed to be easy, and flexible. To work around the performance problems of using “pure Python” code, we instead have to use libraries written in other languages (generally C and C++), like numpy, PyTorch, and TensorFlow, which provide Python wrappers. To work around the problem of a lack of type safety, recent versions of Python have added type annotations that optionally allow the programmer to specify the types used in a program. However, Python’s type system is not capable of expressing many types and type relationships, does not do any automated typing, and can not reliably check all types at compile time. Therefore, using types in Python requires a lot of extra code, but falls far short of the level of type safety that other languages can provide.
...But fast.ai is a Python library, partially written in Swift.
That validates the original point: nothing will replace Python for DL applications any time soon, but middleware will continue to be implemented in C++/Rust/Swift/whatever you fancy.
S4TF isn't the first, and certainly not the last, end-to-end non-Python DL stack. It might be worth highlighting as an example if its mindshare ever rises above the noise floor among those stacks.
> Our hope is that we’ll be able to use Swift to write every layer of the deep learning stack, from the highest level network abstractions all the way down to the lowest level RNN cell implementation. There would be many benefits to doing this...
Well, the tldr: you’re wrong.
The more approachable reading: Python isn't going anywhere, but people are looking at other things for more than just low-level implementations with a Python wrapper.
...it’s early days yet, who knows where things will go... but maybe do a bit more reading and have an open mind?
Yeah, there is some tedium in general with Rust syntax. Semicolons, for example, feel old-fashioned, and there's a lot of verbosity in things like unwrapping. It's a fine language and I like working with it, but there's a lot of detail involved in Rust development that doesn't make sense for data scientists to worry about.
Python is easy to get started with, but once a project grows to any meaningful size, I would rather have a compiler which is giving me more correctness guarantees than Python is capable of.
IMO Swift strikes the best balance between strictness and productivity of any language I've worked with.
This is why I think Swift is a much better choice than Rust to replace, or at least complement, Python. It has a modern, powerful type system and all the quality-of-life advantages that come with it, but it manages this with a lot more usability than Rust.
A well-written Swift framework almost becomes a DSL for the problem domain, which is a great property for a data science tool to have.
You're right, and... I'm working professionally using DL for computer vision, for robotics. We spend the majority of our time writing the business logic around the learned parts. Getting all of that code bug-free and fast is way harder in Python than I expect it would be in Rust. Rust's ndarray ecosystem is immature, but so is Python's static type checking ecosystem (no stable mypy stubs for numpy ndarrays), as is Julia's AOT compilation story.
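A tiny illustration of what I mean on the typing side (just a sketch): even with annotations, a checker like mypy has no idea about shapes or dtypes, so the bugs that actually bite us slip through.

    import numpy as np

    def normalize(image: np.ndarray) -> np.ndarray:
        # Nothing here enforces that `image` is an HxWx3 float32 array rather
        # than, say, a 1-D int array; that mismatch only shows up at runtime.
        return (image - image.mean()) / (image.std() + 1e-8)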
Numerical stability sounds like floating-point artifacts are the problem; could a specification/verification language with more exact arithmetic semantics help?
As somebody who programs in both Python and Rust (and likes both languages) I think Rust's place would be parts of the code that have to be fast, and that you want to get right.
Calling Python code from Rust or Rust from Python is totally doable, and there is in my view no reason why you shouldn't use both in the use cases that suit them.
And the speed part is serious. Some guy once asked for the fastest tokenizer in any given language, and my naive implementation came second place in his benchmark, right after a stripped-down and optimized C variant.
So using Rust for speed critical modules and interfacing them from easy to use Python libraries isn't exactly irrational.
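As a sketch of what that interfacing can look like on the Python side (names and paths here are hypothetical, and in practice PyO3/maturin give you a much nicer wrapper than raw ctypes), assume the Rust crate is built as a cdylib exporting an extern "C" function:

    import ctypes

    # Hypothetical Rust tokenizer built with crate-type = ["cdylib"].
    lib = ctypes.CDLL("./target/release/libfast_tokenizer.so")
    lib.token_count.argtypes = [ctypes.c_char_p, ctypes.c_size_t]
    lib.token_count.restype = ctypes.c_uint64

    text = "the quick brown fox".encode("utf-8")
    print(lib.token_count(text, len(text)))   # heavy lifting happens in Rust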
Calling C (or C++) code from Python is not only totally doable, it's what all the popular libraries do. Furthermore, most ML work runs on the GPU, which also runs C-like code.
Hence, Rust offers no performance benefit that isn't already there. It really only offers safety and modern language features, at the cost of being tedious to use.
This seems to be comparing a hand-implemented neural network in Python (and numpy) with one in Rust. Even in this simple case, the author discovers that in the Python case, most of the time is spent in non-Python linear algebra libraries.
Most of the major deep learning frameworks for Python (TensorFlow, Keras, Torch, MXNet, etc.) will not normally spend the majority of their time in Python. Typically, the strategy is to use Python to declare the overall structure of the net and where to load data from, and then the actual heavy lifting is done in optimized libraries, probably written in C++ (or Fortran; I seem to recall BLAS used Fortran).
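A minimal sketch of that division of labor, using PyTorch as one example: Python only declares the architecture, while the matrix math behind each call runs in the framework's compiled C++/CUDA kernels.

    import torch
    import torch.nn as nn

    model = nn.Sequential(        # Python-level description of the net
        nn.Linear(784, 128),
        nn.ReLU(),
        nn.Linear(128, 10),
    )

    x = torch.randn(32, 784)      # a dummy batch
    logits = model(x)             # the actual GEMMs run in optimized native code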
I think BLAS is a spec rather than one library, so your version may or may not be Fortran. I do think the original "reference implementation" was written in Fortran, which is sometimes called "the BLAS library", but most BLAS builds you see in the wild are not that one.
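If you're curious which one you actually have, NumPy can report which BLAS/LAPACK it was built against (output varies by install: OpenBLAS, MKL, the reference implementation, ...):

    import numpy as np
    np.show_config()   # prints the BLAS/LAPACK libraries this build links to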
This is the strength of Python, though, and one you cannot ignore. Python is old and has fast C implementations for nearly everything you may want to accomplish. There's no shame in that approach showing up in benchmarks.
Neural network libraries (TensorFlow, PyTorch) have a C++ backend and a Python interface, which is great: you get a performant compiled language as the backend and a flexible, user-friendly language as the interface.
Rust vs Python is a weird question because in reality no one writes their own neural network with numpy, and no one expects Rust to act like an interpreted language suitable for data science workflows. It would be more apt to compare Rust and C++.
> Is rust suitable for data science workflows?
> Right now I have to say that the answer is “not yet”. I’ll definitely reach for rust in the future when I need to write optimized low-level code with minimal dependencies. However using it as a full replacement for python or C++ will require a more stabilized and well-developed ecosystem of packages.
I'm not sure rust is really aiming to be something used for data science workflows. I'm not sure the community will be putting much effort into making this a reality.
Fine, but it seems reasonable for someone to check. It's much easier to make a decision about whether a programming language is what you want if you've got evidence from someone who has tried something similar to you, as opposed to just hearing people say "Language X is awesome because of unimaginably low-level (from my point of view) feature Y".
Rust seems great, to be honest, just not universally so. Nothing wrong with defining the boundaries.
There are people working on language features that will get Rust closer to parity with C++ for numerical computing, most prominently "const generics", which will make it more ergonomic to write numeric libraries (see C++'s Eigen) that use static array sizes. This will ultimately be important for how aggressively the compiler can optimize the code, via eliminating bounds checks, etc.
I think it's quite impressive actually that someone can pick up Rust and manage to out-perform Numpy in their first project. BLAS implementations are decades-long exercises in optimization.
In my own experience, Rust has been excellent for the more boring side of data science - churning through TBs of input data.
Versus C++14? Indeed, most DL is in fact C++. PyTorch's recent C++ API is a must. As professionals in this industry, my colleagues and I have switched to full C++. I'd be interested in the advantages of Rust vs. C++ rather than vs. Python (which, in terms of performance, is truly C in the background).
I’m a newbie on the NN topic and am surprised to hear that no one uses NumPy in actual NN implementations, even though it’s written in C and highly optimized. Why is that?
And how about Gonum (the Go equivalent)?
Finally, I’m currently going through the deeplearning.ai program. I’ve got one week left and will experiment with building some apps. Which technical stack should I choose?
Most software (including Python and Numpy and Go and pretty much every Rust program) runs on your computer's CPU. The CPU is good at running programs with a lot of different instructions and if-statements and loops and stuff.
But for neural networks, people often prefer to use special hardware like graphics cards, since graphics cards are really good at doing relatively simple math on many pieces of data at once. So they create special libraries like TensorFlow that can send commands to the graphics card instead of doing the math on the CPU. (And they don't use Numpy because even though it's highly optimized, it's highly optimized for CPUs, and graphics cards are a lot faster than CPUs at running neural networks.)
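A minimal sketch of that "send the math to the graphics card" idea, using PyTorch as one concrete example (TensorFlow has an equivalent mechanism):

    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.randn(4096, 4096, device=device)   # lives on the GPU if one is present
    b = torch.randn(4096, 4096, device=device)
    c = a @ b                                    # the matmul runs on the GPU, not the CPU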
Numpy is at too low a level for applied NN implementations.
If I'm not doing research on new methods but want to build a model for a particular problem using well-known best practices, then all the custom code my app needs, and all I need to write, concerns the transformation, representation, and structure of my particular dataset and task. Things like optimized backpropagation for a stack of bidirectional LSTM layers are not custom to my app; they're generic - why would I need or want to reimplement them except as a learning exercise?
That'd be like reinventing the wheel. For generic things like that I'd want to call a library where the code is well-tested and well-optimized (including for GPU usage) by someone else, and that library isn't numpy. Numpy works at the granularity of matrix multiplication ops, but applied ML works at the granularity of whole layers such as self-attention, LSTM, or CNN layers. These are perhaps not that complex conceptually, but they do require some attention to implement properly in an optimized way; you can implement them in numpy, but you probably shouldn't (except as a learning exercise).
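For instance, here's a rough sketch of the "call a well-tested layer" point, assuming PyTorch: a bidirectional LSTM stack comes as a ready-made, GPU-optimized building block instead of something to hand-roll on top of numpy.

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=2,
                   bidirectional=True, batch_first=True)

    x = torch.randn(8, 50, 64)        # (batch, sequence, features)
    output, (h_n, c_n) = lstm(x)      # forward and backward passes are library code
    print(output.shape)               # torch.Size([8, 50, 256])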
The main reason numpy isn't used in NN implementations is that it does not natively have GPU support. The tensor structures in PyTorch and TensorFlow have the most solid backend support for GPUs (and TPUs) and cover a good amount of numpy's ndarray capabilities. There is recent work to put a numpy-style API on the same footing for deep learning; check https://github.com/google/jax
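A minimal sketch of that numpy-on-equal-footing idea with JAX: the array code looks like numpy, but it can be XLA-compiled and dispatched to GPU/TPU.

    import jax
    import jax.numpy as jnp

    @jax.jit                          # compiles for whatever backend is available
    def predict(w, b, x):
        return jnp.tanh(x @ w + b)

    key = jax.random.PRNGKey(0)
    w = jax.random.normal(key, (64, 32))
    b = jnp.zeros(32)
    x = jax.random.normal(key, (8, 64))
    print(predict(w, b, x).shape)     # (8, 32)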
Nice to see. A Julia implementation should be fun to do, but the point of the article holds: when you work with linear algebra, most of the time is spent in BLAS.
I think this approach is missing the point. You write highly optimized libraries in Rust and then use them from Python.
This is why Python has eaten the world: not because it's the best at any one thing, but because it brings all those things together, at which it is unparalleled and unlikely to be surpassed anytime soon.
numpy, scipy, pandas, tensorflow: all of those have very little actual Python code; it's C++, and even Fortran here and there.
This whole Python vs Bla thing is just silly nonsense. I know Python and some Bla, and so should you. Tonight someone will release SuperFantasticNewThing implemented in Bla, tomorrow someone else will wrap that in Python, and tomorrow night the rest of us will use PySuperFantasticNewThing, and that's exactly how it should be.
I think you're being unnecessarily dismissive of discussions surrounding Python's fitness, and I find your remark "that's exactly how it should be" confusing.
Python is an OK interface language, in that it's script-like, dynamically typed, and simple to comprehend. It's popular, which makes onboarding efficient thanks to the sheer volume of tutorials online. And it has built up a large ecosystem because of those two points.
That said, it's naive to suppose that Python is the ideal interface language now or for the future. It's just OK, plus it's popular.