Docker 0-Day Stopped Cold by SELinux (redhat.com)
233 points by jwildeboer 15 hours ago | 133 comments





This post is incorrect. SELinux does not fully mitigate this issue. We recommend users update to 1.12.6.

I expect Red Hat to issue a retraction shortly. We notified them last night that this post was incorrect.

Source: Security at Docker.


Can you please take some time here to explain why their post was incorrect, with a brief technical explanation of why SELinux enforcement failed to stop the attack that exploits the particular vulnerability?

I realize you're busy, but it would be much more helpful than a curt statement that simply claims they are wrong.


Especially given Docker Inc.'s history of counterproductive Red Hat hostility.

This shouldn't be downvoted. Docker has been very openly history towards Red Hat in the past. To the point of openly mocking their developers at DockerCon.

Hi, I'm the founder of Docker. It's not my place to say whether comments should be downvoted or not, and I don't want to ignite teenage-drama arguments over who was mean to whom at recess - we get enough of that level of discourse on the US political scene these days.

But I think there is an interesting topic to address here, that we deal with a lot at Docker.

The problem in a nutshell: if you choose to take the moral high ground and only promote your products with positive marketing (and that is our strategy at Docker - you will never see any Docker marketing content criticizing competitors), you are vulnerable to bullying and unfair criticism by competitors who don't mind hitting below the belt. Then the question is: do you allow yourself to respond and set the record straight? Or would that just legitimize the criticism by bringing more attention to it? On the other hand, not responding is also risky because it emboldens the trolls to take more and more liberties with facts and ethics. This dilemma becomes more and more pressing as you become more successful and more incumbents start considering you a possible threat to their business. Some of these incumbents have been defending their turf for decades by perfecting negative messaging. As one competing executive once told me - "we eat startups like yours for breakfast". This situation can also be bad for morale, because your team sees their work and reputation dragged through the mud, and can interpret their employer's silence as a failure to stand up and defend them.

The most perverse variation of this problem is when trolls start preemptively painting you as bullies. If that narrative sticks, then you're in trouble, because any attempt to set the record straight will be interpreted as hostility. Now you have two problems: defending yourself against the bullies AND defending yourself against unfair accusations of being a bully.

The root cause of the problem, I think, is the diminishing importance of facts and critical reasoning in the tech community. We are all guilty of this: when was the last time you repeated a factoid about "X doesn't scale", "Y isn't secure", "I heard Z is really evil" without fact-checking it yourself? Be honest. Because of this collective failure to do our own thinking and researching, bullies have a huge first-mover advantage.

I see a direct parallel between the problem of corporate bullying in tech and the problem of partisan bullying in politics. And I think in both cases, there is a big unresolved problem: how do you succeed and do the right thing? How do we collectively change the rules of the game to make bullying and negative communication a less attractive strategy?

I tried really hard to make this a constructive post about a topic I care about. If you interpret any of this as hostile or defensive, that is not at all the intention.


When the top comment is a claim that we at Red Hat post incorrect information and that we at Red Hat are expected to delete said supposedly incorrect information without any technical explanation of why said information is incorrect, I do wonder who is the bully in your opinion.

I posted this entry and I work at Red Hat.


The post is in fact incorrect. The reason Nathan is not sharing more technical details is to protect the security of Red Hat users.

Also, if hypothetically the full details made Red Hat look bad, is it fair to assume you would be calling Nathan hostile for sharing them? In that scenario is there any course of action we could follow that would satisfy you?


The article is about how SELinux helps in mitigating or even blocking paths that would lead to a working exploit.

The article explicitly states the CVE number and the fact that updated packages are available.

The article IMHO doesn't attack nor provoke Docker and its people. Yet the first comment posted here DOES contain direct accusations against Red Hat. I don't think that's helpful nor needed. That's all.

I still think that SELinux and Docker are a good combination and this article helps in understanding why.


The title of the article is "Docker 0-Day Stopped Cold by SELinux." The title strongly implies that SELinux would have prevented the issue in the CVE even without the fixes Docker provides.

Then the text of the article states:

"This CVE reports that if you exec'd into a running container, the processes inside of the container could attack the process that just entered the container. If this process had open file descriptors, the processes inside of the container could ptrace the new process and gain access to those file descriptors and read/write them, even potentially get access to the host network, or execute commands on the host. ... It could do that, if you aren’t using SELinux in enforcing mode."

So, not only does the title make this suggestion, but the text of the article downright says it.

If the claim is wrong, then Docker's security team is right to correct it. However, I think they should do so in a forum other than in the comments of a HN post, be thorough in their explanation, and maintain a professional, polished tone in any communications.

And, of course, Red Hat should correct and/or clarify the post as well.


>The reason Nathan is not sharing more technical details is to protect the security of Red Hat users.

Security through obscurity?


Responsible disclosure.

If you want to be that loose with the definition, passwords and private keys are "security through obscurity".

Giving people time to patch before releasing the details of how it can be exploited isn't a bad security practice.


> you will never see any Docker marketing content criticizing competitors

http://img.scoop.it/vr-SoyYI8yKYsOf0vxriWrnTzqrqzN7Y9aBZTaXo..., from http://www.docker.com/sites/default/files/WP_IntrotoContaine... (page 9)

Don't know if it was marketing material when you published it in that whitepaper, but it definitely became marketing material when the @Docker twitter account tweeted it (https://twitter.com/docker/status/768232653665558528).


I don't think the material you're referring to qualifies as criticizing competitors at all.

- The comparison table is part of an independent study, not authored or commissioned by Docker.

- The table shows the strengths and weaknesses of different container runtimes; weaknesses are highlighted for all of them, including Docker

- The table is used in Docker material to illustrate the point that independent security researchers consider Docker secure. Nowhere do we make the point that other products are insecure. I encourage you to read the whole material and decide for yourself.

- The context for this material was to respond to a massive communication campaign painting Docker as insecure.


Haters gonna hate, man. It's fine to correct misinformation, but in the long run -- and whether they say so or not -- there's much greater respect earned in taking the high ground and not dragging oneself down into the muck of name-calling, ascribing malicious intent to others, and other ill behaviors.

If I were your counselor, I'd advise you to do nothing other than stick to the facts; make the best product you can; delight your customers; take pride in the great work you do; and apologize openly and honestly when you make avoidable mistakes. You can't make everyone happy, so focus on the people you can, and aim to exceed their expectations.


>Haters gonna hate, man. It's fine to correct misinformation, but in the long run -- and whether they say so or not -- there's much greater respect earned in taking the high ground and not dragging oneself down into the muck of name-calling, ascribing malicious intent to others, and other ill behaviors.

This is a naive perspective that doesn't bear out in the real world. It's important to know that by taking the "high road", you are putting yourself at a competitive disadvantage. Someday, those with fewer scruples may have to pay the piper and their dubiously-maintained prosperity may disintegrate ... but then again, maybe not.

Most often, the truth is that large companies are pretty ruthless, and have consolidated such a huge amount of control that it's extremely difficult to do anything about anything they do or have done. They control the messaging, they have a reputation that supersedes any complaint an individual may make, etc. Those companies do slowly atrophy, but usually it's more because they've lost sight of the founder's vision that originally connected with the masses than that they're engaging in questionable tactics.

If you're taking a position out of principle, that has to suffice for itself, because it probably will cost you in material terms.


That whole thing with the docker developer smugly wearing a name badge with 'I reject red hat patches' was just sad.

That's a funny comment, because you are "taking liberties with facts" exactly as lamented by shykes above.

You start with a grain of truth — something that actually happened in reality. In this case, it was a joke protesting systemd hegemony.

Perhaps you thought that joke was in poor taste. But let's leave that aside for now.

So you start with an actual fact. Then, you exaggerate/falsify it, changing the details pretty wildly, and present this story of something that supposedly happened. In fact, nothing like that happened... but it sort of feels like something that might have happened. It vaguely resembles the actual event that did happen (in that, a Docker employee did wear a badge with an opinionated phrase on it at a conference).

The key thing, though, is that what you describe is completely and utterly different from the thing that actually happened in the real world[1].

You might not even be the person who changed the details to make the story more compelling (and false). Maybe you got this information from a post shared on Facebook, or from an email forwarded by your uncle.

Either way, though, the impact of your comment is to pollute the body of discussion and degrade the collective understanding of this topic. (If this process feels familiar, it's because it is exactly the process that eventually caused the failure of the American democracy... just at a much smaller scale).

Personally I don't have any stake in the Docker/RedHat relationship and I don't care about it. I only looked up what actually happened[1] because the idea of a Docker employee wearing an official badge that says "I reject red hat patches" seemed so unlikely to have occurred that it sent my bullshitometer into the red.

Suggestion: when something smells like bullshit, don't eat it without conducting a bit of research.

[1]: https://www.facebook.com/hackerspace.budapest/photos/a.40703...


That badge seems pretty unprofessional to me. I would discipline my employees for that sort of behavior, especially if it occurred at my own conference.

Well, I don't disagree. If I were in charge at Docker, I would be embarrassed and very annoyed by that badge.

But not nearly as furious as I'd be had it actually read "I reject red hat patches" as claimed above.


Meh...I wouldn't. It's definitely an ingroup-humor signaling thing, but I find it hard to believe that someone would read that and seriously get offended unless they're being self-righteous.

This isn't even true. The badge, which was a joke btw, said "I say no to systemd specific PRs."

Very openly history?

Guessing "hostile" was intended.

> Very openly history?

Given krakensden's posting (https://news.ycombinator.com/item?id=13399853):

> Especially given Docker Inc.'s history of counterproductive Red Hat hostility.

I think that it is clear that andrewguenther meant 'hostile' instead of 'history' in (https://news.ycombinator.com/item?id=13400383):

> Docker has been very openly history towards Red Hat in the past.

Let he who has never typed a passing thought rather than the word he meant cast the first stone.


We're working with Red Hat now. Folks can expect more technical details when everyone is on the same page.

That said, the solution is the same as with every other piece of software -- update to latest to get security fixes.


>> "Source: Security at Docker"

In case it is not obvious, the comment above is by Nathan McCauley, who is the Director of Security for Docker.

Source: https://news.ycombinator.com/user?id=bigmac


Why not simply state that when posting? Bad style IMHO.

He stated the source, and information about his Docker affiliation is readily available. HN guidelines discourage signing comments:

Please don't sign comments; they're already signed with your username. If other users want to learn more about you, they can click on it to see your profile.


There's a huge difference between having a generic signature for every comment you post and disclosing an affiliation that adds validity to the claims made in the comment.

It doesn't say "don't sign all your comments", it simply says "don't sign comments". Also, it should be interpreted in the light of the fact that modern netiquette on other sites like Stack Overflow which have usernames is to never sign your posts.

Here it is to disclose affiliation, which people would otherwise forget to check due to the nature of the 'battle'.

Also, there is an assumption that the signature contains up-to-date information and/or does not change over time. If it does change, that undermines its historical value. In this case the signature has changed and no longer refers to the position/information from the moment of writing.

I agree with how both jwildeboer (Jan) and shykes (Solomon) approached this. Much appreciated in this case.

But yes, in a normal situation, this is irrelevant and the username signature is sufficient.


I don't care if Jesus, the director of security in heaven said that.

I'm going to take a look at both arguments and decide for myself. No need to name drop.


>No need to name drop.

Give it a rest. This is a semi-anonymous forum where people's identities aren't tied to their usernames. This isn't name dropping, it's providing helpful context.


The name (or title) drop might effect appropriate urgency, seems legit.

Edit: Don't downvote people trying to help me improve my English. :(


Your usage is actually correct. Which is great, considering many native English speakers get this one wrong. The heuristic we hear in school is something like "use 'affect' as a verb and 'effect' as a noun," which like many grammar heuristics is of course an oversimplification of reality. Usage of effect as a verb isn't super common in general conversation by native English speakers, and I think most might choose to say something like "establish authority" instead in this case, but your intention is still clear.

affect

Because the other comment didn't spell it out: effect is correct there. Effect as a verb means something like "to cause to happen". Don't pretend effect/affect is just a noun/verb split. Both words have meanings as both verbs and nouns. It's best to just learn both meanings of each instead of following some rule that's wrong a fair amount of time.

Mary Norris, copy editor at the New Yorker, has a wonderful short video on this: http://www.newyorker.com/culture/culture-desk/comma-queen-af...

I don't care if Jesus, director of grammar in heaven said this....

Just kidding.


Effect is correct here.

effect

What on earth is wrong with you? This is a security incident.

It's relevant and vital to know the background of people who are making statements like this.

And sorry but not everyone is a kernel engineer who can navigate the truth between RedHat and Docker.


I have no information here, but it's certainly possible that both sides are not willing to publicly disclose the full extent of the vulnerability. I think that's less wise than usual given what Red Hat is writing and how disputed it is, but that's probably their standard practice.

Some of the comments from Red Hat previously implied that they thought the vulnerability could only be exploited via ptrace, which SELinux denied by default for Docker containers. That's definitely not true; ptrace was used in the PoC because it's easy and likely to win the race condition, but you can also grab file descriptors out of /proc/$pid/fd.

However, the blog post appears to show SELinux stopping attacks that don't involve ptrace, because SELinux forbids writing to an open file or an open network socket that has the wrong context. If Docker believes there are attack vectors that aren't covered by the default SELinux policies (such as writing to something that's not a regular file or network socket), they might be unwilling to disclose that too loudly until Red Hat gets around to saying "Uh, actually please patch".
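
For anyone unfamiliar with the /proc route being described, here's a rough sketch (the target PID and fd number are hypothetical, and this assumes the kernel's usual dumpability/ownership checks pass):

    # list the file descriptors another process holds open (they appear as symlinks)
    ls -l /proc/$TARGET_PID/fd
    # opening one of those links re-opens the underlying file, no ptrace needed
    cat /proc/$TARGET_PID/fd/4            # read via the inherited descriptor
    echo x > /proc/$TARGET_PID/fd/4       # or write to it

With SELinux enforcing, that open/read/write is still checked against the label on the underlying object, which is what the blog post shows; the open question is whether every such path is actually covered by the default policy.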


Who filled in for the position of Director of Security between 0 and ~32AD?

AFAICS Red Hat explicitly mentions that updates are available and the benefits of having SELinux in no way means you shouldn't update.

Disclaimer: I work at Red Hat


> the benefits of having SELinux in no way means you shouldn't update

Where is this explicitly mentioned in the post?

I got the opposite impression reading this story titled "Docker 0-Day Stopped Cold by SELinux", with the closing statement "When we heard about this vulnerability we were glad to see that our customers were safe".

I'm sure your customers are glad to hear that as well, but it sounds like the Docker folks have reason to believe SELinux doesn't fully mitigate this vulnerability.


When a 0day hits, you first assess the impact, define your solution and start to work.

FTA

"Fixed packages have been prepared and shipped for RHEL as well as Fedora and Centos."

So updates were made, tested and made available. Our customers typically implement these security related updates very fast.

With that out of the way, the article explains how SELinux can mitigate this and similar issues.

And I am 100% sure that we coordinated the update and changes with Docker because that's how Open Source works.


Do you have a PoC? Even if you don't, please add the information to the security@opencontainers.org thread. I'm guessing that it would involve overwriting program code before the SELinux policy is set by runC?

SELinux used to be one of those things you'd disable immediately upon installing a new RHEL/CentOS box for all the troubles it would cause. But default policies have evolved a lot over the past few years, making that the wrong thing to do. But people still do it out of habit.

If you ignore SELinux, it won't cause issues besides the occasional need to run "restorecon" (which one gets into the habit of doing whenever an "access denied" error happens when permissions seem otherwise correct).
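
For readers who haven't hit this: the typical symptom is a "permission denied" even though the rwx bits look right, because the file carries the wrong SELinux label (for example after a file was moved into a service's directory instead of created there). A minimal sketch, with an example path:

    # show the SELinux label next to the normal permissions
    ls -Z /var/www/html/index.html
    # reset the label to whatever the loaded policy says that path should have
    restorecon -v /var/www/html/index.html
    # or recursively for a whole tree
    restorecon -Rv /var/www/html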

But one problem still remains. SELinux is (very) complex and people (myself included) have a very hard time grokking its base concepts. This limits adoption greatly, and I'm still to find a decent document that starts from the simple stuff and lets one build a mental concept of how it works before jumping into the more complicated (real-world) use cases.


I worked at Red Hat when SELinux was made mandatory. Every single customer was hit with 'avc denied'. Having to explain that avc was 'Access Vector Cache' and that 'Access Vector Cache' was part of SELinux, and trying to convince somebody to patch the messages to just say 'selinux denied' was nightmarish.

That said: SELinux is one of the only things that can make shared Docker hosting (ie, where the containers are actually isolated from the host and each other) possible.


Correcting self: 'made mandatory' -> 'enabled by default'

You're absolutely right. The problem is that it really doesn't matter if it is amazing and works fine out of the box: people won't use it because SELinux is generally perceived as a technology that is too complicated and will break things if you don't disable it.

At this point, very few people will bother to learn the few bits you need to know to troubleshoot and fix any issues you might encounter; and any benefits you get from using it are not worth the effort.

I used Fedora from 12 to 21 (? I think) and always left SELinux enabled, and it just works. The few things that failed (I remember two issues, one with an experimental build of Chromium and another one with OpenVPN and certificates in $HOME), I submitted a bug report and created my own rule to work around the issue.
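
The "created my own rule" step is mostly mechanical these days; a sketch of the usual workflow (the module name is arbitrary, and the generated rules should be reviewed rather than loaded blindly):

    # show the recent AVC denials
    sudo ausearch -m avc -ts recent
    # turn them into a small local policy module and load it
    sudo ausearch -m avc -ts recent | audit2allow -M my_local_fix
    sudo semodule -i my_local_fix.pp
    # my_local_fix.te contains the human-readable rules worth sanity-checking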

If I managed to run a desktop system with SELinux on, it should be possible (and potentially easier) to use it properly on a server.


> people won't use it because SELinux is generally perceived as a technology that is too complicated

It's amazing how bad things have gotten with the majority of developers and admins, that they refuse to learn things that are difficult and instead simply turn it off. It's not like all of them are too young to remember when nothing in their system was easy and they actually had to learn what they were doing.


I'm looking at disabling selinux on some of the systems I work on. But I don't think it's for a lack of trying to use and understand it. I don't think it's just a matter of it being complicated; to me the system is cryptic, and information about how to do things correctly is very difficult to find.

While I agree with your sentiment, I think I have given selinux a fair chance, but I've reached the point where it seems to add more overhead than benefit for us. On top of that, I'm also likely making mistakes with my configuration that make it too permissive, because with my limited understanding I'm putting together modules that make my stuff work, but I don't actually understand the implications of some of the decisions I'm making about permissions.

And my entire team seems to be struggling with selinux, and a little bit fed up with it, because we keep running into it blocking things, simple things, against our intuitions of the way our system and selinux should work.

It might be the perfect access control system, but to me the UX is horrible. Or maybe I just hate to admit it, but I've utterly failed at building a mental model for how it works, and how I can effectively interact with it to do what I need to do.

So while you might be right, at least in the case of selinux, I'm not sure I agree that it's simply a matter of developers and admins refusing to learn things that are difficult. In the selinux case, I think it goes deeper: it creates such a cognitive load that someone needs to invest a significant amount of effort to truly learn it.


Just agreeing. I make a point of using some LSM everywhere I can, but in practice selinux is still hard to debug properly. The level of abstraction in the modules doesn't help here. System-wide integration means that patching a single app profile sometimes involves patching the system-wide policy package, sometimes just one module. And that's just on a redhat-like system - anything else gives no guarantees about selinux working at all.

Sure, apparmor has fewer options, but it's trivial to manage a profile along with the package itself. The learning and reporting system is much simpler. And there's no need for debugging the "where did I miss the context setting this time" problem in deployment.

I can deal with both and either one is needed. But selinux simply wastes my time way too often.


> it creates such a cognitive load that someone needs to invest a significant amount of effort to truly learn it.

To benefit from SELinux, I don't think you need to "truly learn" it... but you do need to invest some effort developing troubleshooting skills that are specific to SELinux. But the thrust of your point is right. We need to invest some effort to benefit from it.

I think postings like RedHat's are useful because they show us in a concrete way why it might be worth the small effort to develop the troubleshooting skills, or even worth the large effort to really understand SELinux.


There is a line between something being difficult and something causing any part of the system to randomly stop working with no explanation whatsoever except -if you're lucky- a single log line somewhere that will give no result when searched for on google.

Well - devs and admins spend the minimum effort required to reach their primary goal. And I believe most of the time that it is in line with their employer's policy.

I'm too young to remember how it was in the old times (would be glad to hear a story or 2), but since my very first job, the internal policy (the real one - not the one presented to customers or auditors) has always been "screw the firewall, selinux, principle of least privilege. Disable everything so it works right now and grant the dev team root access to the prod environments - they need it for an urgent customer issue!".


Humans are designed to obtain their goals while conserving as much energy as possible. As a rule, people will always take the lowest-effort route to get what they want. At a macro scale, this can only be counteracted by designing systems that strongly discourage specific low-effort-but-harmful routes, and even then, there will always be some sector of the population that doesn't grok the downside of taking the "easier" way.

I think one of the big things that has impacted SELinux adoption is that everyone has sort of seen it as a proactive booster rather than a really necessary part of a secure environment. I'm sure a lot of that is because lots of people are used to administering systems that were around before SELinux (and similar) was. For something that's perceived as a bonus point, it interferes far too often.

Most people don't realize SELinux is on until it does something really bad like stopping a database from restarting correctly or otherwise harming what's supposed to be a stable environment. When those are the stakes, SELinux does not have, or at least does not make immediately clear, a sufficient value proposition to incentivize the admin to fix the rules rather than just turning the whole thing off.

For example, to contrast with the OP, when SELinux stops Docker from doing something, the impulse is not going to be "Yay, SELinux stopped Docker from doing something dangerous! Thank you glorious SELinux!" Instead, it will be "Ugh, SELinux again getting in the way of stuff. I just need to disable that. I get enough headaches from Docker as-is and SELinux probably just isn't modern enough to handle my uber-charged stack with all of its new-fangled features. Disabling!"


To be fair, documentation on SELinux sucks. Git is also complex, but there is a lot of good documentation for it. When I first tried to develop an SELinux policy it took me a month to learn. Too many things are undocumented, or are documented but it's not clear where.

A lot of this seems driven by IT departments cut to the bone or outsourced for minimum bid offshore.

In either case, enterprise IT no longer has the time for fiddling that it used to.


Selinux creates a lot of extra work for very little visible benefit.

Hopefully posts like this one help make the benefit more visible.

And much of the extra work is one-time work, not ongoing effort.


> that they refuse to learn things that are difficult and instead simply turn it off.

I wish I could do this with our Logstash cluster. It requires so much hand-holding, and troubleshooting can be quite opaque. Last night the logstash indexing service had 'just stopped', and was sending 57000-character-long json loglines into its own log.

And if they do fix a bug, you can't just upgrade one component, you have to upgrade logstash and elasticsearch and kibana... and maybe whichever beats you're using.


"I used Fedora from 12 to 21 (? I think) and always left SELinux enabled, and it just works."

I've had all sorts of weird errors over the years when I set up new CentOS machines.. one thing or the other. And then I remember to turn off SELinux, and it suddenly works. Over and over again.

It's going to take a lot of convincing to get me to leave SELinux on when it's caused so many problems over the years.


The thing that kills it for me is that Fedora ships with broken default configuration.

Install Fedora, reboot and log in. Chances are within minutes you'll start seeing "selinux denied" messages popping up, complaining about services, files and policies you've never heard of. How is anyone but a seasoned RHEL admin supposed to know what to do with that?


This might have been true many years ago, but it certainly isn't true nowadays (at least... not on the Workstation spin, can't vouch for others).

I've had it happen to me in the last week, with Fedora 25 Workstation Edition, the latest and greatest.

This seems like a good example of YMMV :)


Hope you filed a bug in this instance! Fedora makes this really easy to do.

Sort of. I tried the latest workstation release recently and the installer was broken.. you need a bug tracker account to file a bug, so you need your browser to work, which is rough when your system doesn't.

I really should have, as I can't remember the specifics now.

I'll try again in a VM and see if I can recreate the issue.


That's the issue with LSMs in general. They all work, but historically have been difficult to configure and impossible to maintain. That's changing.

There used to be only one (SELinux); however, there's competition now from other LSMs: Smack, AppArmor, Tomoyo, etc. In part, that's why SELinux is improving.

I've tried them all and settled on Tomoyo. The documentation is outstanding and it is (to me at least) the easiest LSM to reason about and configure.

http://tomoyo.osdn.jp/


How up2date is tomoyo, though? The Changelog and updates on the front page seem to indicate that not much happened after 2015.


I've had to create AppArmor profiles in the past. There is some good tooling, but it's still not trivial.

> I'm still to find a decent document that starts from the simple stuff and lets one build a mental concept of how it works before jumping into the more complicated (real-world) use cases.

Well, there is this:

https://people.redhat.com/duffy/selinux/selinux-coloring-boo...

You can link it to repeat offenders who disable SELinux. (That might not be a good idea.)


That sounds pretty cool, but where do rules come from?

Suppose some author wrote some daemon. Is the packager responsible for writing the rules? It sounds like having packagers understand SELinux rules is a lot of responsibility, and if upstream is cross-platform, they might not care enough about such platform-specific needs to provide them.

Also, what happens if I write some small app? Do I need to write its rules? If it has no rules applied to it, then it's basically game over, because SELinux sounds like it works ONLY if it applies to all processes.


A packager is generally responsible for selinux policies if they aren't suitable for inclusion in the core policy. As a developer, if you want to write them, please do, but certain aspects rely on things like where binaries and application data are stored, so you can't always write a policy that won't need tweaking on a specific distribution.

SELinux policies take some time to write the first time or two, but typically running in permissive mode and running your app with a permissionless context will give you everything you need to include in one.

Admins have the hard job when they move default data directories around; it takes time to get used to running 'semanage fcontext' in addition to setting file system permissions.
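
To make the 'semanage fcontext' point concrete, this is roughly what relocating web content to a non-default directory looks like (the /srv/www path and the httpd_sys_content_t type are just an example):

    # record that /srv/www should be labeled as web content
    sudo semanage fcontext -a -t httpd_sys_content_t "/srv/www(/.*)?"
    # apply the labels to the existing files
    sudo restorecon -Rv /srv/www
    # confirm what the policy now expects for that path
    sudo semanage fcontext -l | grep /srv/www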


Read that a while ago and it didn't make it click for me.

If I had just stumbled across it without a recommendation and without recognising the name of the author, I'd have easily dismissed it as some kind of trolling, but that's maybe just me.

What I want is:

1. How to do tasks x, y and z (with explanations on why).

2. Complete documentation of all commands, settings files etc.

Usually 2. is somewhat covered in the official docs and 1. is available in blog posts etc. Last I checked 1 was not covered in any way when it comes to SELinux.

Making a coloring book out of it is nowhere on my list.


Too late to edit, but let me be perfectly clear that I am totally fine with people making coloring books etc.; I just wish there was more how-to, but that might just be me.

> SELinux is (very) complex and people (myself included) have a very hard time grokking its base concepts.

I wonder what a clean slate OS design would look like. One that satisfied the same requirements without any concerns about backward compatibility with POSIX history.

Does anyone know an OS with vaguely this goal?


I'm probably biased, but Solaris' role-based access control system has been very successful among its particular customer base:

https://docs.oracle.com/cd/E23824_01/html/821-1456/rbac-1.ht...

It wasn't necessary to throw away POSIX history or concerns either.


Aside from well known options like Plan 9 and QNX:

- https://pdos.csail.mit.edu/archive/exo/

- https://atheos.syllable.org/


How about Magenta [0] by Google, on top of Little Kernel (LK), a neat and modern microkernel design? (It's part of Google's work-in-progress complete OS named Fuchsia, which appeared briefly a while ago on tech news sites.)

[0] https://fuchsia.googlesource.com/magenta


Microsoft's Midori project. It never saw the light of day, but there are some very interesting blog posts about it. The Redox project has some leanings in this direction.

Robigalia is interesting but not nearly ready: https://gitlab.com/robigalia (their website has a cert error right now. It seems like I've seen a lot of those these days)

It's a rust userland built upon SEL4. SEL4 is very simplified in order to meet their verification goals so robigalia has to implement some interesting resource sharing primitives on top of it to get things to work. It could be interesting.


Not sure if you're only considering Unix/BSD-ish OSes here or not. But there are perhaps hundreds of such projects.

http://tunes.org/cliki/operating_20systems.html

You may want to dig around that entire site to get an idea of what people have tried to do (and frequently failed).


Yes, Trusted UNIX where you have a very complex MAC framework

Try the first two to be certified to high-assurance security:

http://www.cse.psu.edu/~trj1/cse443-s12/docs/ch6.pdf

The MLS model was too difficult to adapt to commercial use. Biba was good for stopping malware from overwriting files. They still preferred something more flexible. SCC then invented type enforcement in another high-assurance system:

https://web.archive.org/web/20160311233659/http://www.cyberd...

Flask architecture was combining that tech with a microkernel. SCC, acquired by McAfee, added type enforcement to a BSD OS for their Sidewinder firewall. The next work by Mitre was proof-of-concept for OSS by adding it to Linux. That and a pile of incremental additions is called SELinux. I'm sure you'll find the LOCK design a lot cleaner as it was originally intended. ;)

Also worth noting are the KeyKOS system (esp with KeySAFE), the capability-security machines, and one language-based mechanism:

http://www.cis.upenn.edu/~KeyKOS/

http://www.cs.washington.edu/homes/levy/capabook/index.html

http://www4.cs.fau.de/Projects/JX/

These collectively should keep you busy for a while. They're the kind of thing worth imitating or building on.


> But people still do it out of habit.

It's very hard to come back from such a popular image of "100% broken and must be disabled immediately". Even if SELinux evolves to absolute perfection, the damage to its image is done, and that will take a long time to change, regrettably. It should not have been shipped so green.

> If you ignore SELinux, it won't cause issues besides the occasional need to run "restorecon" (which one gets into the habit of doing whenever an "access denied" error happens when permissions seem otherwise correct).

Sounds like it's still pretty broken, IMHO. I should never see an "access denied" error on a host I control, unless I misconfigured it.

The truth is, the defaults MUST work on all common scenarios all of the time for these things to be successful. Otherwise, people will only see the downsides (and the upsides are rarely visible, and rarely outweigh the downsides).


> I should never see an "access denied" error on a host I control, unless I misconfigured it.

What do you mean? Even stock UNIX will give a permission denied error if you try to run an executable without `chmod +x`-ing it or `rm -rf /boot` as a regular user.


> If you ignore SELinux, it won't cause issues besides the occasional need to run "restorecon" (which one gets into the habit of doing whenever an "access denied" error happens when permissions seem otherwise correct).

You also have to put things in the correct place. For instance, your VPN certificates should be in $HOME/.cert so restorecon knows they should have the home_cert_t label.
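
A rough sketch of that, assuming the certificates were originally dropped somewhere arbitrary under $HOME (the ~/vpn paths are hypothetical):

    # move the certs into the directory the default policy already knows about
    mkdir -p ~/.cert
    mv ~/vpn/client.crt ~/vpn/client.key ~/.cert/
    # relabel so they pick up home_cert_t, then verify
    restorecon -Rv ~/.cert
    ls -Z ~/.cert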


It doesn't help that cloud providers, e.g. DigitalOcean, disable SELinux.

When I asked them about this and why they do not even tell the customer about this, they said it is so that they can reset the password when requested through the control panel.

If their "reset password" implementation took care to not mess up the MAC labels on edited files, disabling SELinux wouldn't be necessary.

See virt-rescue(1) (and libguestfs-tools in general) for how to do it properly.
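
For instance, something along these lines with libguestfs-tools resets the password on the offline disk image while keeping the guest's SELinux labels intact, so the provider wouldn't need to disable SELinux at all (the image path and password are placeholders):

    # set the guest's root password and relabel, working directly on the disk image
    virt-customize -a /var/lib/libvirt/images/guest.qcow2 \
        --root-password password:TemporaryPass123 \
        --selinux-relabel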


Seems like they could write an SELinux rule to allow that behavior...

Yeah, but it's basically the same as `setenforce 0`.

You need to be in the context that will allow the change of password. For example, if this is an agent, you need to be able to execute code in the agent. If this is from a DHCP hook in early boot, you need to execute in that hook.

That's totally not equivalent to the use of "setenforce 0".


Yes, you are right. However, when I calculate risks and the protection/price ratio, I see no difference. A system with `setenforce 0` is equal to a system with the rule to the first two significant digits, because the root password is the most significant factor and the risk of attack through the web UI is just as high in both cases. Other risks are less significant than that, and SELinux does not reduce them significantly.

It sounds like you're saying the security position can be simplified to one dimension. It can't. Unless you're able to provide risk and exposure for all users and somehow make them the same for everyone (they're not).

Then you write that SELinux doesn't reduce the risk. It definitely does that for webapps executing wrong commands, utility services being exploited for local access, many attempts at race conditions via shared directories, etc. For example, almost all interesting use cases for ImageTragick exploits are severely limited by properly configured selinux. Once in a while there's going to be an issue with a trivial exploit which everybody and their dog will use to scan the whole internet before you have time to patch. This is what an LSM is great at protecting against.


I'm not saying that risk is not reduced. I'm saying SELinux is not effective, so it's better to spend my time/money on something that can reduce risk significantly.

I have seen no 0-day exploits to date which were stopped by SELinux. For example, a trojan can use the apache process as a malware host, without reading from or writing to disk at all. SELinux will not stop that even in theory.


Is there a good reason why it's hard to make SELinux easy to use?

Because parts of SELinux which are easy to use are already covered by POSIX.

Since when is a security issue which is known to and patched by the vendor a "0-day"? Have there been any reported exploitations of this vulnerability in the wild before it was publicly disclosed with a patch made immediately available?

No, there were not, and it was disclosed to related vendors some weeks earlier. Some people seem to think "0 day" is a l33t term for "vulnerability".

I think all they are suggesting is that running SELinux can protect you from vulnerabilities that no one knows about yet. Obviously they can't give an example of a vulnerability that is still a 0-day.

I think they mean "SELinux was stopping this vulnerability from being exploited before it was even discovered."

SELinux is one of our main protections against user abuse. Our project (webminal.org) provides free terminal access to anyone, thus we need protection. Behind the scenes we rely on things like SELinux/quota/pam.d/limits.conf/rootkits etc. SELinux has more of a learning curve than the others, but it's worth a ton.

I'm not sure if the situation has changed, but I recall installing Fedora (and another distro I cannot recall) years ago, and SELinux would kill sshd when a connection was received, on a clean, out-of-the-box installation, making the host inaccessible.

This sort of super-critical bug makes software go immediately onto my blacklist, and it's very hard to come back from that - it basically meant that it had to be disabled immediately, because its defaults were completely broken.


I installed Fedora Server 25 a week or so ago on a small server; sshd was open (with an extremely strong random password) and fully functional at install time. SELinux has caused no issues; I actually chose to install docker in the installer and it came pre-configured to play nicely with SELinux.

Firewalld on the other hand, I'm still figuring out (firewall-cmd is useful, but trying to translate iptables rules -> firewalld is proving harder than I expected)
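
In case it helps, the firewalld equivalents of common iptables one-liners tend to look like this (the port, service and source are just examples):

    # iptables -A INPUT -p tcp --dport 8080 -j ACCEPT, roughly translated:
    firewall-cmd --permanent --add-port=8080/tcp
    # allow a named service definition instead of a raw port
    firewall-cmd --permanent --add-service=https
    # source-based rules go through "rich rules"
    firewall-cmd --permanent \
        --add-rich-rule='rule family="ipv4" source address="10.0.0.0/8" accept'
    # --permanent changes only take effect after a reload
    firewall-cmd --reload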


One of the reasons I like OpenBSD. Linux has a habit of taking well established things and replacing them with incompatible things (firewalld, systemd for example).

I have enough to do, I don't need to throw away years of acquired experience every few releases. It's one thing if the replacements are clearly better, but for me they just seem to be new, different ways to do the same things.


I'd argue that (at least in recent years) OpenBSD does a lot of replacing as well, but I happen to like their direction, where the replacements are simpler rather than more complex.

firewalld is just a daemon frontend to iptables. The underlying firewall hasn't changed since Linux moved from ipchains to iptables.

Every time I mess with Fedora I have to screw around with selinux trying to figure out how to make ~/.ssh/authorized_keys work. Every time I need to google the stupid magic incantation that makes things work (that normally should "just work") because I can just never remember it.
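
For the record, the incantation is usually nothing more than a relabel of ~/.ssh (copied or restored keys tend to end up with the wrong type, and sshd then refuses to read them under SELinux); a minimal sketch:

    # the entries should normally carry the ssh_home_t type
    ls -Z ~/.ssh ~/.ssh/authorized_keys
    restorecon -Rv ~/.ssh
    # and the plain permissions still need to be what sshd expects
    chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys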

Does AppArmor also block this I wonder?

As far as I know, the answer is yes. In fact if you look at the Docker apparmor documentation you can see an example where ptrace was blocked https://docs.docker.com/engine/security/apparmor/

I just want to update this to clarify that it blocks ptrace, but this is only part of the issue and you shouldn't rely on AppArmor to mitigate this CVE entirely.
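
Either way, it's worth checking that the profile is actually loaded and applied before relying on it; a quick sketch (the container name is hypothetical):

    # is the docker-default profile loaded and enforcing?
    sudo aa-status | grep docker-default
    # which AppArmor profile is a given container actually running under?
    docker inspect --format '{{ .AppArmorProfile }}' mycontainer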

I think it could.

Here's a write-up showing how AppArmor can protect Docker containers and the underlying host... quote from the article, "So without even patching the container we have prevented rouge pid from spawning using a correct security profile with AppArmor."

    https://scottydoesntknow.io/container-secure-right/

AppArmor and SELinux share the same set of LSM hooks underneath, so there's comparatively little difference in capability. The big fight is over the philosophy of how they are configured (very broadly: AppArmor tends to tolerate ad hoc configuration by doing what you want instead of what you say, and SELinux likes to break everything it doesn't understand perfectly).

I really wish SELinux would evolve some better configuration tools/culture, but after a decade and a half I despair of this ever happening. Every single Fedora release I leave it enabled on my personal machine thinking "THIS will be the moment where I really puzzle it out". Then I disable it a week later when I realize how much annoying work configuring rules for my rando backup and device management scripting will be, and when I see that it still lacks rules for a bunch of in-distro tools I need to use.


Last I looked, Apparmor still applied rules based on file names, not on types assigned to the underlying objects. That's a big limit on writing correct policies: every hardlink is like a firewall-spanning device.

I see it mentioned so often, but in practice... is that really an issue for you? You need to have root privs with unrestricted access (or at least hardlink creation in the directory explicitly allowed by apparmor) for that. That means the attack would have to look something like:

1. gain access to the system (unrelated exploit)

2. elevate to root with profile which can create hardlinks at all

3. have access to a directory unrestricted in the profile

4. have local, unrestricted application which can be exploited by hardlink manipulation

This is theoretically possible on some systems. But it's a massive effort and fairly easily mitigated by having a profile for all running services which disallows hardlinks in the first place. As far as risk of service exploitation goes, this should be a fairly minimal one. (and requires targeted approach)


AppArmor is a DAC+MAC and SELinux is an RBAC+MAC. AppArmor is not nearly as powerful. Grsec is much more effective than AppArmor at preventing a wide range of attacks, but SELinux is more or less king.

My understanding is: yes, it is blocked by AppArmor.

Based on seeing email chatter about this report in the context of Cloud Foundry[0]. Under the hood CF uses runC, partly to allow AppArmor to be applied.

Disclosure: I work for Pivotal, the majority contributor of engineering to Cloud Foundry.

[0] https://www.cloudfoundry.org/cve-2016-9962/


I would recommend not giving that advice, and recommend that all your customers upgrade. There are a number of ways to exploit this, and as stated elsewhere on this thread, Red Hat's advice that SELinux entirely mitigates this is not correct; it is highly likely that it can be exploited on Cloud Foundry too. I would never advise people not to upgrade software when there is a CVE just because someone says there is a mitigation; it is very high risk.

Disclosure: I worked on testing the fixes for this CVE.


Thanks for your comment. We (Garden - Cloud Foundry container project) released with the runc patch yesterday (https://github.com/cloudfoundry/garden-runc-release/releases...).

It was our understanding from the original report that the vulnerability was mitigated by AppArmor disabling ptrace, by no user process running as pid 1 inside the container, and because in CF buildpack apps, user processes run as unprivileged users. This is the stance communicated in the CVE report.

However, with some further consideration and updated information yesterday, we decided it would be prudent to patch and release immediately to be on the safe side. This was communicated to the Cloud Foundry Security team.


User processes running as unprivileged users may be sufficient to mitigate. AppArmor did not always mitigate it during testing. But upgrading is highly recommended, as there were several ways to exploit this and different races, including one related kernel bug that was fixed in very recent releases.

> It was our understanding from the original report that the vulnerability was mitigated by AppArmor disabling ptrace, by no user process running as pid 1 inside the container, and because in CF buildpack apps, user processes run as unprivileged users. This is the stance communicated in the CVE report.

I'd have to think about this further, but I'm not convinced that would be sufficient protection (accessing /proc/$pid/fd has a different set of access requirements to ptrace -- it's a dumpability check basically). However, since you've already sent patches around it's all good.

Disclosure: I discovered, wrote patches for and helped with coordination of this vuln.


If you'd like to email me, I can connect you to our security triage team to argue the case.

If we're wrong, we'll change it.

Edit: looks like we already did.


Depends on how well it's configured. My guess is that an out-of-the-box profile probably wouldn't.

Just a pity that SELinux is widely disabled because it ends up being a pain.

Productivity is also stopped cold by SELinux :P

Selinux is badly designed and it's not surprising people don't use it. Experts are supposed to simplify complexity. You can't design convoluted and complex applications that are user-hostile and break things without warning and then accuse people of laziness. Few who deploy apps take security casually, and selinux is not the only way to gain security.

In a typical scenario, deploying software can already get hairy, and with selinux in the way you could end up going down multiple rabbit holes and squandering hours, only to discover selinux is somehow in the way, disabling some functionality without clearly logging what exactly it is disabling. That's why most advice online is to disable it.

Given it's connected to the NSA, and Red Hat tried its best to get it into the kernel at one time, that's all the more reason for anyone concerned about the NSA to avoid it. Security experts like the author of grsecurity also don't think too highly of it.




