We’re on macOS using docker, which is basically Linux, and building numpy et al is a pain. Sometimes we have failures on our build pipelines (Linux), sometimes on our dev machines, and it always takes forever.
We do a little of everything:
- signal processing
- machine learning, to optimize our simulations
- heavy math for simulations (lots of trig)
- work with 3D models (we use Blender’s bpy)
That said, we won’t be paying for pyx. Our problems could be solved by building our dev docker images in our pipelines and having devs just pull them down; we just haven’t bothered because the above is somewhat rare.
I don’t know who this is for, because rebuilding numpy et al should be somewhat rare, and the annoyance is usually only a few min when it happens.
macOS plus Docker is asking for trouble. It requires all kinds of hacks and awareness of where things are being built or mounted, which architecture is being emulated because it all runs in a VM, memory limits, swapping, and so on. It’s no surprise build times are a problem.
I imagine either your build pipeline is for aarch64, you’ve made modifications to numpy itself (build flags or something), you’re using a special distro in the pipelines, or the rebuild is unnecessary. For me numpy has always been a simple “pip install numpy” or “poetry install” when not on NixOS.
My guess is that if you used non-emulated Linux, your local build issues would be reduced and your builds would be faster. But I can’t say that with any certainty, because I neither have nor need Nvidia hardware in my Linux rig, nor do I know your setup very well.
You do however fall quite nicely into the criteria I shared. Were you outside of that, Python would probably be much, much easier (aka no non-Python deps).
Nope, build pipeline is for x86 since that’s what we deploy on, and we use Debian in the containers and Ubuntu for the host. We’ve talked about switching to aarch64, but as of yet that hasn’t happened.
And I could be misremembering; it may actually have been scipy that caused issues, I get the two mixed up. We tend to delay bumping versions, and usually the fix is to bump the patch version or something.
no non-Python deps
The thing is, Python is designed as a glue language. You’re supposed to use native code to speed up the slow parts; that’s the whole point.
That’s news to me. Got a source?

Here’s an old paper from Guido van Rossum about Python as a glue language, which specifically calls out its extensibility via C/C++. From my understanding, the general philosophy of Python is to first make it obviously correct, and then optimize the parts that are too slow with native code.
And that’s generally how we use it. The majority of our codebase is in regular Python, with some dependencies (e.g. numpy, scipy, torch) in native code because Python is too slow, and then some larger chunks that are 100% native code because the interface between Python and C++ is too slow for complex simulations. But outside of those corners of our application, we have hundreds of thousands of lines of simple Python code spread across a bunch of microservices. If Python didn’t offer that easy native-code extensibility, we’d use a different language.
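To make that concrete, the glue pattern is often nothing fancier than the sketch below. ctypes is in the standard library; libfastsim.so and rotate_x are made-up names purely for illustration, not a real library or our actual code.

```python
# Minimal "glue" sketch: orchestration in Python, math in native code.
# Assumes a hypothetical C library libfastsim.so exporting:
#     double rotate_x(double angle_rad);
import ctypes

lib = ctypes.CDLL("./libfastsim.so")  # path is a placeholder

# Declare the C signature so ctypes converts arguments and results correctly.
lib.rotate_x.argtypes = [ctypes.c_double]
lib.rotate_x.restype = ctypes.c_double

def rotate_x(angle_rad: float) -> float:
    """Thin Python wrapper around the native routine."""
    return lib.rotate_x(angle_rad)
```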
The link you posted, while interesting, doesn’t substantiate the claim that Python was “designed” to be glue code. It’s a nice side effect of its simple yet powerful syntax and its extensibility via CPython.
That non-Python dependencies are a problem is not surprising, and I don’t think any language has really solved it. C/C++ doesn’t have a dependency system: no package or library registry or manager, no universal build system, and no version manager. Building anything in that language alone is a chore. Rust has to be present on the system and in the right version; the same goes for D, Zig, Go, and others.
If you’re on Linux there’s a good chance the library you want to use is already compiled for you and you can just download it as part of the Python package installation. Building it on Linux is peeking into Pandora’s box; building it on non-Linux is opening Pandora’s box. The only non-FHS Linux I’ve encountered that somewhat solves the dependency problem is NixOS.
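For what it’s worth, the reason Linux usually gets prebuilt binaries is that pip matches published wheels against the tags your platform supports, and mainstream glibc distros accept the manylinux tags most projects upload. A quick way to see those tags yourself, assuming the packaging package is installed (it’s the same library pip vendors internally):

```python
# Print the wheel tags this interpreter/platform accepts, most preferred first.
# Requires: pip install packaging
from packaging.tags import sys_tags

for tag in sys_tags():
    # On a typical x86_64 glibc distro, entries like
    # cp312-cp312-manylinux_2_28_x86_64 show up in this list.
    print(tag)
```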
IMO, complaining about non-Python dependencies is born of ignorance of other build systems. If they can’t solve their own problems, it’s a tall order to demand it of Python’s packaging system. Python isn’t perfect, of course, but the problems we’re talking about here aren’t Python problems, IMO.
That paper was written a few years after Python was released, and before 2.0 was a thing. It says Python was already being used that way in industry, and gives reasons to prefer that over writing native code from the start (faster dev time and whatnot). The way module loading and the whole standard library work facilitates native extensions.
His stated goals for Python were generally:
- clean code for the interpreter
- clean syntax to make intent clear
- easy to learn and use
They go so far as to reject changes that would significantly increase performance if they complicate the code. Why? The simplest explanation is that if you need the speed, building your own native extension is the way to go. That’s the way I’ve been told to use Python: write in Python, and if it’s not fast enough, rewrite the slow parts in C/C++.
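Here’s that workflow in miniature, as a toy sketch rather than anything from a real codebase (exact numbers will vary by machine): write the obvious loop first, and when it’s too slow, push the inner loop into compiled code, in this case via numpy’s C routines.

```python
# Toy "make it correct, then make the hot loop native" example:
# the same trig-heavy sum as a plain Python loop and vectorized with numpy.
import math
import timeit

import numpy as np

N = 1_000_000

def slow_sum() -> float:
    total = 0.0
    for i in range(N):
        total += math.sin(i) * math.cos(i)
    return total

def fast_sum() -> float:
    x = np.arange(N, dtype=np.float64)
    return float(np.sum(np.sin(x) * np.cos(x)))

if __name__ == "__main__":
    print("pure Python:", timeit.timeit(slow_sum, number=1))
    print("numpy:      ", timeit.timeit(fast_sum, number=1))
```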
LuaJIT is an example of the opposite approach: the interpreter is designed to be fast enough that you generally don’t need native extensions, and performance is prioritized over features, unlike Python. As a result, community contributions are very limited, and many people stick with the official runtime.
Python wouldn’t be very successful without native extensions, so I think it’s absolutely fair to say it was an early design decision.
That’s a real problem! macOS and Docker are extremely common for development, and a scripting language like Python should just work with them, not just at the language level (which is good enough) but in the ecosystem and tooling. The latter is what Astral is trying to improve.
I’m not talking specifically about Python, just macOS and Docker. That combination has proven time and time again to be a massive time sink in every place I’ve seen it used. Devs can spend days (not joking, I’ve seen this happen multiple times) getting it set up for whichever purpose.
It went from all kinds of fancy wrappers around QEMU, to claims that Rosetta and Intel Docker would work, to Apple’s VMs, Docker’s proprietary VM (after they dropped QEMU), and Ubuntu’s Multipass, and that’s where I stopped following the news. Every single one of these solutions claimed to be the best, and they all ran into such massive problems that they needed hacks, workarounds, and “novel” solutions.
And these weren’t even niche or hardware-related things. Simply mounting the repository’s folder or a node_modules folder absolutely killed performance on Macs, because it wasn’t direct hardware or filesystem access; everything had to go through a socket of some sort from the VM to the Mac. Anything with many files or heavy file access completely tanked: Java, Kotlin, C/C++, Rust, anything involving compilation, JavaScript and its modules, Perl too.
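If you want to see it for yourself, a crude timing sketch like the one below makes the gap obvious; both paths are placeholders, and the point is just that stat-ing tens of thousands of files across the VM’s file-sharing layer is far slower than doing it on a path that lives inside the container.

```python
# Crude comparison: walk a tree and stat every file, once on a bind-mounted
# path (shared from the macOS host) and once on a container-local path.
# Both paths are placeholders; point them at real directories before running.
import os
import time

def walk_and_stat(root: str) -> None:
    start = time.perf_counter()
    count = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            os.stat(os.path.join(dirpath, name))  # one metadata round trip per file
            count += 1
    print(f"{root}: {count} files stat'd in {time.perf_counter() - start:.2f}s")

walk_and_stat("/workspace/node_modules")     # bind mount from the host (placeholder)
walk_and_stat("/opt/vendored/node_modules")  # inside the image/volume (placeholder)
```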
One would expect the logical conclusion to be “Mac isn’t the best development platform for Linux deployables”, but instead, a few glossy rounded corners later, people cling to it and then blame the tools for not adapting to Mac’s hacks.
Well, that’s fair; first-class support for running an actual Linux kernel, with the containerization support that implies, is one of the reasons I prefer Windows with WSL as a dev environment when using Docker (at least, in companies where the IT team won’t just let you run native Linux).