Backdoor in upstream xz/liblzma leading to SSH server compromise (openwall.com)
4549 points by rkta 49 days ago | 1849 comments



Very annoying - the apparent author of the backdoor was in communication with me over several weeks trying to get xz 5.6.x added to Fedora 40 & 41 because of its "great new features". We even worked with him to fix the valgrind issue (which it turns out now was caused by the backdoor he had added). We had to race last night to fix the problem after an inadvertent break of the embargo.

He has been part of the xz project for 2 years, adding all sorts of binary test files, and to be honest with this level of sophistication I would be suspicious of even older versions of xz until proven otherwise.


GitHub has suspended @JiaT75's account.

EDIT: Lasse Collin's account @Larhzu has also been suspended.

EDIT: Github has disabled all Tukaani repositories, including downloads from the releases page.

--

EDIT: Just did a bit of poking. xz-embedded was touched by Jia as well, and it appears to be used in the Linux kernel. I did a quick look and it doesn't appear Jia touched anything of interest in there. I also checked the previous mirror at the Tukaani project website, and nothing was out of place other than lagging a few commits behind:

https://gist.github.com/Qix-/f1a1b9a933e8847f56103bc14783ab7...

--

Here's a mailing list message from them ca. 2022.

https://listor.tp-sv.se/pipermail/tp-sv_listor.tp-sv.se/2022...

--

MinGW w64 on AUR was last published by Jia on Feb 29: https://aur.archlinux.org/cgit/aur.git/log/?h=mingw-w64-xz (found by searching their public key: 22D465F2B4C173803B20C6DE59FCF207FEA7F445)

--

pacman-static on AUR still lists their public key as a contributor, xz was last updated to 5.4.5 on 17-11-2023: https://aur.archlinux.org/cgit/aur.git/?h=pacman-static

EDIT: I've emailed the maintainer to have the key removed.

--

Alpine was patched as of 6 hours ago.

https://git.alpinelinux.org/aports/commit/?id=982d2c6bcbbb57...

--

OpenSUSE is still listing Jia's public key: https://sources.suse.com/SUSE:SLE-15-SP6:GA/xz/576e550c49a36... (cross-ref with https://web.archive.org/web/20240329235153/https://tukaani.o...)

EDIT: Spoke with some folks in the package channel on libera, seems to be a non-issue. It is not used as attestation nor an ACL.

--

Arch appears to still list Jia as an approved publisher, if I'm understanding this page correctly.

https://gitlab.archlinux.org/archlinux/packaging/packages/xz...

EDIT: Just sent an email to the last committer to bring it to their attention.

EDIT: It's been removed.

--

jiatan's Libera info indicates they registered on Dec 12 13:43:12 2022 with no timezone information.

    -NickServ- Information on jiatan (account jiatan):
    -NickServ- Registered : Dec 12 13:43:12 2022 +0000 (1y 15w 3d ago)
    -NickServ- Last seen : (less than two weeks ago)
    -NickServ- User seen : (less than two weeks ago)
    -NickServ- Flags : HideMail, Private
    -NickServ- jiatan has enabled nick protection
    -NickServ- *** End of Info ***
/whowas expired not too long ago, unfortunately. If anyone has it I'd love to know.

They are not registered on freenode.

EDIT: Libera has stated they have not received any requests for information from any agencies as of yet (Saturday, 30 March 2024 00:39:31 UTC).

EDIT: Jia Tan was using a VPN to connect; that's all I'll be sharing here.


Just for posterity since I can no longer edit: Libera staff has been firm and unrelenting in their position not to disclose anything whatsoever about the account. I obtained the last point on my own. Libera has made it clear they will not budge on this topic, which I applaud and respect. They were not involved whatsoever in ascertaining a VPN was used, and since that fact makes anything else about the connection information moot, there's nothing else to say about it.


[flagged]


I am not LE nor a government official. I did not present a warrant of any kind. I asked in a channel about it. Libera refused to provide information. Libera respecting the privacy of users is of course something I applaud and respect. Why wouldn't I?


Respect not giving out identifying information on individuals whenever someone asks, no matter what company they work for and what job they do? Yes. I respect this.


It's called keeping your integrity by not disclosing private info about any users on your network, regardless of whether they are bad actors.

I respect them for that.

Violating that code would be just as bad as the bad actor slipping in backdoors.


I hope you aren’t in control of any customer data.


> EDIT: Github has disabled all Tukaani repositories, including downloads from the releases page.

Why? Isn't it better to freeze them and let as many people as possible analyze the code?


Good question, though I can imagine they took this action for two reasons:

1. They don't have the ability to freeze repos (i.e. would require some engineering effort to implement it), as I've never seen them do that before.

2. Many distros (and I assume many enterprises) were still linking to the GitHub releases to source the infected tarballs for building. Disabling the repo prevents that.

The infected tarballs and repo are still available elsewhere for researchers to find, too.


They could always archive it. Theoretically (and I mean theoretically only), there's another reason for Microsoft to prevent access to the repo: if a nation state was involved and there have been back-channel conversations to obfuscate the trail.


Archiving the repo doesn't stop the downloads. They would need to rename it in order to prevent distro CI/CD from continuing to download untrustworthy stuff.


Distros downloading directly from GitHub deserve what they get.


Maybe one can get the code from here. New commits are being added, it seems.

https://git.tukaani.org/


The latest commit is interesting (f9cf4c05edd14, "Fix sabotaged Landlock sandbox check").

It looks like one of Jia Tan's commits (328c52da8a2) added a stray "." character to a piece of C code that was part of a check for sandboxing support, which I guess would cause the code to fail to compile, causing the check to fail, causing the sandboxing to be disabled.
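For context, configure-time feature checks work by compiling a tiny throwaway program and enabling the feature only if compilation succeeds. Below is a minimal sketch of such a check (the macro name and check body are illustrative, not xz's actual configure code); a single stray token anywhere in it is enough to make the compile fail and silently turn the sandbox off.

    /* Hypothetical configure-time probe: the build defines something like
       HAVE_LINUX_LANDLOCK only if this compiles. A stray "." outside a
       comment, like the one added in the sabotaged commit, is a syntax
       error, so the check fails and the sandbox is silently disabled. */
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        return syscall(SYS_landlock_create_ruleset, (void *)0, 0,
                       LANDLOCK_CREATE_RULESET_VERSION);
    }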


Lasse has also started his own documentation on the incident.

https://tukaani.org/xz-backdoor/


Shouldn't they have tests running to ensure that the check works on at least some systems?


What do you mean "tests"?


Have a system where you expect the sandboxing to work and an automated check that it compiles there?


Part of the backdoor was in the tests. The attacker in this case could easily have sabotaged the test as well if a test was required.


If your project becomes complex enough, eventually you need tests for the configure step. Even without malicious actors it's easy to miss that a compiler or system change broke some check.
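One cheap way to do that (a sketch only; config.h and HAVE_LINUX_LANDLOCK are stand-ins for whatever the real build system defines) is a test that runs on a CI host where the feature is known to exist and fails if configure did not detect it:

    /* Hypothetical "configure sanity" test. If the feature check was broken
       (accidentally or deliberately), this turns a silently disabled sandbox
       into a red test on a CI host known to support Landlock. */
    #include "config.h"
    #include <stdio.h>

    int main(void)
    {
    #if defined(__linux__) && !defined(HAVE_LINUX_LANDLOCK)
        fprintf(stderr, "configure failed to detect Landlock on this host\n");
        return 1;
    #else
        return 0;
    #endif
    }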


You can still find the source everywhere, if you look for it. Having a fine-looking page distribute vulnerable source code is a much bigger threat.


You can find it on the archive. Someone archived it last night.


[flagged]


Don't agree here. I've only ever seen GitHub do this in extreme circumstances where they were absolutely warranted.


The Alpine patch includes gettext-dev, which is likely also compromised, as the same authors have been pushing gettext into projects where their changes have been questioned.


What do you mean?


Look at the newest commits; do you see anything suspicious?

https://git.alpinelinux.org/aports/log/main/gettext

libunistring could also be affected as that has also been pushed there


Seeing so many commits that are "skip failing test" is a very strong code smell.


Yes, but it is often a sad reality of trying to run projects mainly written for glibc on musl. Not many people write portable C these days.


It's still the wrong way to go about things. Tests are there for a reason, meaning if they fail you should try to understand them to the point where you can fix the problem (broken test or actual bug) instead of just wantonly disabling tests until you get a green light.


> do you see anything suspicious

No.

> libunistring could also be affected as that has also been pushed there

What do you mean by "that"?


FWIW, that's mingw-w64-xz (cross-compiled xz utils) in AUR, not mingw-w64 (which would normally refer to the compiler toolchain itself).


Good catch, thanks :)


It appears to be an RCE, not a public key bypass: https://news.ycombinator.com/item?id=39877312


I've posted an earlier WHOWAS of jiatan here: https://news.ycombinator.com/item?id=39868773


Asking this here too: why isn't there an automated A/B or diff check matching the tarball contents against the repo, which auto-flags with a warning when they differ? Am I missing something here?


The tarballs mismatching from the git tree is a feature, not a bug. Projects that use submodules may want to include these and projects using autoconf may want to generate and include the configure script.


> The tarballs mismatching from the git tree is a feature, not a bug.

A feature which allowed the exploit to take place, let's put it that way.

Over here: https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78b...

> The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.

There are multiple suggestions in that thread that this is a legacy practice that might be outdated, especially in the current climate of cyber threats.

Someone even posted a more thorough gist on what could be done to increase transparency and reduce discrepancies between tarballs and repos: https://gist.github.com/smintrh78/97b5cb4d8332ea4808f25b47c8...



"lol"

> Those days are pretty much behind us. Sure, you can compile code and tweak software configurations if you want to--but most of the time, users don't want to. Organizations generally don't want to, they want to rely on certified products that they can vet for their environment and get support for. This is why enterprise open source exists. Users and organizations count on vendors to turn upstreams into coherent downstream products that meet their needs.

> In turn, vendors like Red Hat learn from customer requests and feedback about what features they need and want. That, then, benefits the upstream project in the form of new features and bugfixes, etc., and ultimately finds its way into products and the cycle continues.

"and when the upstream is tainted, everyone drinks poisoned water downstream, simple as that!"


account is back online https://github.com/JiaT75


Hopefully it's still locked, just visible, so people can find and analyze his contributions.


I think this has been in the making for almost a year. The whole ifunc infrastructure was added in June 2023 by Hans Jansen and Jia Tan. The initial patch is "authored by" Lasse Collin in the git metadata, but the code actually came from Hans Jansen: https://github.com/tukaani-project/xz/commit/ee44863ae88e377...

> Thanks to Hans Jansen for the original patch.

https://github.com/tukaani-project/xz/pull/53

There were a ton of patches by these two subsequently because the ifunc code was breaking with all sorts of build options and obviously caused many problems with various sanitizers. Subsequently the configure script was modified multiple times to detect the use of sanitizers and abort the build unless either the sanitizer was disabled or the use of ifuncs was disabled. That would've masked the payload in many testing and debugging environments.
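For reference, this is roughly what the ifunc mechanism looks like (a minimal sketch with illustrative names, not xz's actual symbols). The resolver runs while the dynamic loader is relocating the library, i.e. before main(), and whatever pointer it returns is what every later call goes through, which is why it makes such an attractive early hook:

    #include <stddef.h>
    #include <stdint.h>

    /* Stubs standing in for a table-based CRC and a CLMUL-accelerated CRC. */
    static uint64_t crc64_generic(const uint8_t *buf, size_t len) { (void)buf; (void)len; return 0; }
    static uint64_t crc64_clmul(const uint8_t *buf, size_t len)   { (void)buf; (void)len; return 0; }

    static int cpu_has_clmul(void) { return 0; }  /* stand-in for a CPUID probe */

    /* Called once by the dynamic loader during relocation; the returned
       pointer is written into the GOT and used for all later crc64() calls. */
    static uint64_t (*resolve_crc64(void))(const uint8_t *, size_t)
    {
        return cpu_has_clmul() ? crc64_clmul : crc64_generic;
    }

    uint64_t crc64(const uint8_t *buf, size_t len)
        __attribute__((ifunc("resolve_crc64")));

In the backdoored builds, that load-time execution window is reportedly where the malicious code gets to run and tamper with symbol resolution of the process loading the library.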

The hansjans162 Github account was created in 2023 and the only thing it did was add this code to liblzma. The same name later did an NMU at Debian for the vulnerable version. Another "<name><number>" account (which only appears here, once) then pops up and asks for the vulnerable version to be imported: https://www.mail-archive.com/search?l=debian-bugs-dist@lists...


One week ago, the "Hans Jansen" user "hjansen" was created on Debian's Salsa and opened 8 merge requests, including the upgrade of xz-utils to 5.6.1.

From https://salsa.debian.org/users/hjansen/activity

Author: Hans Jansen <hansjansen162@outlook.com>

- [Debian Games / empire](https://salsa.debian.org/games-team/empire): opened merge request "!2 New upstream version 1.17" - March 17, 2024

- [Debian Games / empire](https://salsa.debian.org/games-team/empire): opened merge request "!1 Update to upstream 1.17" - March 17, 2024

- [Debian Games / libretro / libretro-core-info](https://salsa.debian.org/games-team/libretro/libretro-core-i...): opened merge request "!2 New upstream version 1.17.0" - March 17, 2024

- [Debian Games / libretro / libretro-core-info](https://salsa.debian.org/games-team/libretro/libretro-core-i...): opened merge request "!1 Update to upstream 1.17.0" - March 17, 2024

- [Debian Games / endless-sky](https://salsa.debian.org/games-team/endless-sky): opened merge request "!6 Update upstream branch to 0.10.6" - March 17, 2024

- [Debian Games / endless-sky](https://salsa.debian.org/games-team/endless-sky): opened merge request "!5 Update to upstream 0.10.6" - March 17, 2024

- [Debian / Xz Utils](https://salsa.debian.org/debian/xz-utils): opened merge request "!1 Update to upstream 5.6.1" - March 17, 2024


That looks exactly like what you'd want to see to disguise the actual request: a number of pointless upstream updates to things that are mostly ignored, and then the one you want.


glad I didn't merge it ...


Make it two years.

Jia Tan getting maintainer access looks almost certain to be part of the operation. Lasse Collin mentioned multiple times how Jia had helped off-list, and to me it seems like Jia befriended Lasse as well (see how Lasse talks about them in 2023).

Also the pattern of astroturfing dates back to 2022. See for example this thread where Jia, who has helped at this point for a few weeks, posts a patch, and a <name><number>@protonmail (jigarkumar17) user pops up and then bumps the thread three times(!) lamenting the slowness of the project and pushing for Jia to get commit access: https://www.mail-archive.com/xz-devel@tukaani.org/msg00553.h...

Naturally, like in the other instances of this happening, this user only appears once on the internet.


Also, I saw this Hans Jansen user pushing for merging the 5.6.1 update in Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1067708


From: krygorin4545 <krygorin4545@proton.me>
To: "1067708@bugs.debian.org" <1067708@bugs.debian.org>
Cc: "sebastian@breakpoint.cc" <sebastian@breakpoint.cc>, "bage@debian.org" <bage@debian.org>
Subject: Re: RFS: xz-utils/5.6.1-0.1 [NMU] -- XZ-format compression utilities
Date: Tue, 26 Mar 2024 19:27:47 +0000

Also seeing this bug. Extra valgrind output causes some failed tests for me. Looks like the new version will resolve it. Would like this new version so I can continue work.

--

Wow.

(Edited for clarity.)


Wow, what a big pile of infrastructure for a non-optimization.

An internal call via ifunc is not magic — it’s just a call via the GOT or PLT, which boils down to function pointers. An internal call through a hidden visibility function pointer (the right way to do this) is also a function pointer.

The even better solution is a plain old if statement, which implements the very very fancy "devirtualization" optimization, and the result will be effectively predicted on most CPUs and is not subject to the whole pile of issues that retpolines are needed to work around.


Right, IFUNCs make sense for library functions where you have the function pointer indirection anyway. They make much less sense for internal functions - the only argument over a regular function pointer would be the pointer being marked RO after it is resolved (if the library was linked with -z relro -z now), but an if avoids even that issue.
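For comparison, the plain-if dispatch the parent comments prefer might look like this (an illustrative sketch, not xz's code); nothing runs before main(), and there is no function pointer for early-loading code to retarget:

    #include <stddef.h>
    #include <stdint.h>

    /* Stubs standing in for the portable and CLMUL-accelerated implementations. */
    static uint64_t crc64_generic(const uint8_t *buf, size_t len) { (void)buf; (void)len; return 0; }
    static uint64_t crc64_clmul(const uint8_t *buf, size_t len)   { (void)buf; (void)len; return 0; }

    static int cpu_has_clmul(void) { return 0; }  /* stand-in for a CPUID probe */

    uint64_t crc64(const uint8_t *buf, size_t len)
    {
        /* Probed once, then cached; the branch goes the same way on every
           call, so the predictor makes it essentially free. */
        static int has_clmul = -1;
        if (has_clmul < 0)
            has_clmul = cpu_has_clmul();
        return has_clmul ? crc64_clmul(buf, len) : crc64_generic(buf, len);
    }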


> because the ifunc code was breaking with all sorts of build options and obviously caused many problems with various sanitizers

for example, https://github.com/google/oss-fuzz/pull/10667


>Hans Jansen and Jia Tan

Are they really two people conspiring?

Unless proven otherwise, it is safe to assume one is just a pseudonymous alias of the other.


or possibly just one person acting as two, or a group of people?


Or a group managing many identities, backdooring many different projects



Does anybody know anything about Jia Tan? Is it likely just a made up persona? Or is this a well-known person.


It’s certainly a pseudonym just like all the other personas we’ve seen popping up on the mailing list supporting this “Jia Tan” in these couple of years. For all intents and purposes they can be of any nationality until we know more.


It seems like Hans Jansen has also an account on proton.me (hansjansen162@proton.me) with the Outlook address configured as recovery-email.


Yesterday sure was fun, wasn't it :p Thanks for all your help and for working with me on getting this cleaned up in Fedora.


PSA: I just noticed homebrew installed the compromised version on my Mac as a dependency of some other package. You may want to check this to see what version you get:

   xz --version
Homebrew has already taken action; a `brew upgrade` will downgrade you back to the last known good version.


I also had a homebrew installed affected version.

I understand it's unlikely, but is there anything I can do to check if the backdoor was used? Also any other steps I should take after "brew upgrade"?


Quoting[1] from Homebrew on Github:

>> Looks like that Homebrew users (both macOS and Linux, both Intel and ARM) are unlikely affected?

> Correct. Though we do not appear to be affected, this revert was done out of an abundance of caution.

[1] https://github.com/Homebrew/homebrew-core/pull/167512


Thanks for this. I just ran brew upgrade and the result was as you described:

  xz 5.6.1 -> 5.4.6


sorry, what exact version(s) is the one(s) affected again?

(or SHAs, etc.)

(EDIT: 5.6.0 and 5.6.1 ?)

(EDIT 2: Ooof, looks like the nix unstable channel uses xz 5.6.1 at this time)

I use Nix to manage this stuff on Mac, not Homebrew...


GitHub disabled the xz repo, making it a bit more difficult for nix to revert to an older version. They've made a fix, but it will take several more days for the build systems to finish rebuilding the ~220,000 packages that depend on the bootstrap utils.



Lol they shouldn't be relying on GitHub in the first place.


What should they be relying on instead? Maybe rsync everything to an FTP server? Or Torrents? From your other comments, you seem to think no one should ever use GitHub for anything.


Is it actually compromised on homebrew though? I guess we can't be sure but it seemed to be checking if it was being packaged as .deb or .rpm?


Is 5.2.2 safe? Just 5.6.0 and 5.6.1 are bad?


Is it normal that when I try to uninstall xz it is trying to install lzma?


It means that `xz` was depended upon by something that depends on eg "xz OR lzma"


because of its "great new features"

"great" for whom? I've seen enough of the industry to immediately feel suspicious when someone uses that sort of phrasing in an attempt to persuade me. It's no different from claiming a "better experience" or similar.


I made a library where version 2 is really really much faster than version 1. I'd want everyone to just move to version 2.


But then you are naming a specific great new feature, performance, and backing it not just with a vague claim but with numbers.


I'm sure they actually had new features…


What are they specifically?

I don't know how you can be missing the essence of the problem here or that comment's point.

Vague claims are meaningless and valueless and are now even worse than that, they are a red flag.

Please don't tell me that you would accept a pr that didn't explain what it did, and why it did it, and how it did it, with code that actually matched up with the claim, and was all actually something you wanted or agreed was a good change to your project.

Updating to the next version of a library is completely unrelated. When you update a library, you don't know what all the changes to the library were, _but the library's maintainers do_, and you essentially trust that library's maintainers to be doing their job and not accepting random patches that might do anything.

Updating a dependency and trusting a project to be sane is entirely a different prospect from accepting a pr and just trusting that the submitter only did things that are both well intentioned and well executed.

If you don't get this then I for sure will not be using or trusting your library.


Yeah... the RISC-V routine was put in, then some binary test files were added later that are probably now suspect.

don't miss out on the quality code, like the line that has: i += 4 - 2;

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=50255feeaab...


FWIW, "4 - 2" is explained earlier in the file:

  // The "-2" is included because the for-loop will
  // always increment by 2. In this case, we want to
  // skip an extra 2 bytes since we used 4 bytes
  // of input.
  i += 4 - 2;


> some binary test files were added later that are probably now suspect

That's confirmed

From https://www.openwall.com/lists/oss-security/2024/03/29/4:

> The files containing the bulk of the exploit are in an obfuscated form in

> tests/files/bad-3-corrupt_lzma2.xz

> tests/files/good-large_compressed.lzma

> committed upstream. They were initially added in

> https://github.com/tukaani-project/xz/commit/cf44e4b7f5dfdbf...


It probably makes sense to start isolating build processes from test case resources.


Sure but then you can smuggle it into basically any other part of the build process…?


You can find more examples of that kind of puffery if you go to a website's cookie consent pop-up and find the clause after "we use cookies to...".


I’ve long thought that those “this new version fixes bugs and improves user experience” patch notes that Meta et al copy and paste on every release shouldn’t be permitted.


Tell me about it. I look at all these random updates that get pushed to my mobile phone and they all pretty much have that kind of fluff in the description. Apple/Android should take some steps to improve this or outright ban this practice. In terms of importance to them though I imagine this is pretty low on the list.

I have dreamed about an automated LLM system that can "diff" the changes out of the binary and provide some insight. You know, give back a tiny bit of power to the user. I'll keep dreaming.


It's worse: as someone who does try to provide release notes, I'm often cut off by the max length of the field. And even then, Play only shows you the notes for the latest version of the app.


Slack's Mac app release notes [1] rotate a few copy pastes, here's the one that shits me the most.

> We tuned up the engine and gave the interiors a thorough clean. Everything is now running smoothly again.

Yeah nah mate, if every release is the first release where everything is running smoothly, I'm not going to believe it this time either.

Makes me wonder if the team has some release quota to fill and will push a build even if nothing meaningful has actually changed.

[1] https://slack.com/release-notes/mac


Ugh. That's especially annoying because they're trying to be hip with slang and use a metaphor that requires cultural knowledge that you can't really assume everyone has.


Interesting that one of the commits updating the test file commented that it was regenerated from a fixed random seed for better reproducibility (although how goes unmentioned). For the future, random test data had better be generated as part of the build, rather than being committed as opaque blobs...


I agree in principle, but sometimes programmatically generating test data is not so easy.

E.g.: I have a specific JPEG committed into a repository because it triggers a specific issue when reading its metadata. It's not just _random_ data, but specific bogus data.

But yeah, if the test blob is purely random, then you can just commit a seed and generate it during tests.
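A minimal sketch of that idea (the seed, generator, and output size here are all made up): commit a tiny generator instead of the blob, so reviewers diff readable source rather than opaque bytes:

    #include <stdint.h>
    #include <stdio.h>

    /* Fixed, documented seed. A hand-rolled xorshift32 keeps the byte stream
       identical on every platform (rand() streams differ between libcs). */
    static uint32_t state = 0xC0FFEE42u;

    static uint8_t next_byte(void)
    {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return (uint8_t)state;
    }

    int main(void)
    {
        FILE *f = fopen("generated-test-input.bin", "wb");
        if (f == NULL)
            return 1;
        for (int i = 0; i < 4096; i++)
            fputc(next_byte(), f);
        return fclose(f) == 0 ? 0 : 1;
    }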


Debian have reverted xz-utils (in unstable) to 5.4.5 – actual version string is “5.6.1+really5.4.5-1”. So presumably that version's safe; we shall see…


Is that version truly vetted? "Jia Tan" has been the official maintainer since 5.4.3, could have pushed code under any other pseudonym, and controls the signing keys. I would have felt better about reverting farther back, xz hasn't had any breaking changes for a long time.


It looks like this is being discussed, with a complication of additional symbols that were introduced https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1068024


Thanks for this! I found this URL in the thread very interesting!

https://www.nongnu.org/lzip/xz_inadequate.html


It is an excellent technical write-up and yet again another testimonial to the importance of keeping things simple.


The other comments here showing that the backdoor was a long-term effort now make me wonder just how long of an effort it was...


It's not only that account; the other maintainer has been pushing the same promotion all over the place.


TIL that +really is a canonical string. [0]

[0]: https://www.debian.org/doc/debian-policy/ch-controlfields.ht...


There are suggestions to roll back further.


After reading the original post by Andres Freund, https://www.openwall.com/lists/oss-security/2024/03/29/4, I understand his analysis to indicate that the RSA_public_decrypt function is being redirected to the malware code. Since RSA_public_decrypt is only used in the context of RSA public/private key authentication, can we reasonably conclude that the backdoor does not affect username-password authentication?


Isn't it rather that the attacker can log in to the compromised server by exploiting the RSA code path?


I’m surprised there isn’t way more of this stuff. The supply chain is so huge and therefore represents so much surface area.


There probably is. Way more than anyone knows. I bet every major project on github is riddled with state actors.


Imagine if sshd was distributed by PyPI or cargo or npm instead of by a distro.


Github accounts of both xz maintainers have been suspended.


Not true, the original author wasn't suspended: https://github.com/Larhzu

https://github.com/JiaT75 was suspended for a moment, but isn't anymore?


GitHub’s UI has been getting notoriously bad at showing consistent and timely information lately; this could be an issue stemming from that.


Yeah. Had a weird problem last week where GitHub was serving old source code from the raw URL when using curl, but showing the latest source when coming from a browser.

Super frustrating when trying to develop automation. :(


Both are suspended for me. Check followers on both accounts, both have a suspended pill right next to their names.


Ah, thanks for correcting me there - really weird that this isn't visible from the profile itself. Not even from the organization.

The "following" page for each account does indeed show both as suspended.

https://github.com/Larhzu?tab=following

https://github.com/JiaT75?tab=following


Lasse's account was restored


GitHub should add a badge for "inject backdoor into core open source infrastructure".


Hey maybe it would get bad actors to come clean trying to get that badge.


These shouldn't be suspended, and neither should their repositories. People might want to dig through the source code. It's okay if they add a warning on the repository, but suspending _everything_ is a stupid thing to do.


Tools don't read warnings. Of course the information should not be hidden completely but intentionally breaking the download URLs makes sense.


This can also be handled relatively easily. They can disable the old links and a new one can be added specifically for the disabled repository. Or even just let the repository be browsable through the interface at least.

Simply showing one giant page saying "This repository is disabled" is not helpful in any way.


Do you know if it was actually the commit author, or if their commit access was compromised?


If it was a compromise, it also included the signing keys, as the release tarball was modified vs. the source available on GitHub.


Nice. I worked on a Linux distro when I was a wee lad and all we did was compute a new md5 and ship it.


Name and shame this author. They should never be allowed anywhere near any open projects ever again.


Please don't?

1. You don't actually know what has been done by whom or why. You don't know if the author intended all of this, or if their account was compromised. You don't know if someone is pretending to be someone else. You don't know if this person was being blackmailed, forced against their will, etc. You don't really know much of anything, except a backdoor was introduced by somebody.

2. Assuming the author did do something maliciously, relying on personal reputation is bad security practice. The majority of successful security attacks come from insiders. You have to trust insiders, because someone has to get work done, and you don't know who's an insider attacker until they are found out. It's therefore a best security practice to limit access, provide audit logs, sign artifacts, etc, so you can trace back where an incursion happened, identify poisoned artifacts, remove them, etc. Just saying "let's ostracize Phil and hope this never happens again" doesn't work.

3. A lot of today's famous and important security researchers were, at one time or another, absolute dirtbags who did bad things. Human beings are fallible. But human beings can also grow and change. Nobody wants to listen to reason or compassion when their blood is up, so nobody wants to hear this right now. But that's why it needs to be said now. If someone is found guilty beyond a reasonable doubt (that's really the important part...), then name and shame, sure, shame can work wonders. But at some point people need to be given another chance.


100% fair -- we don't know if their account was compromised or if they meant to do this intentionally.

If it were me I'd be doing damage control to clear my name if my account was hacked and abused in this manner.

Otherwise if I was doing this knowing full well what would happen then full, complete defederation of me and my ability to contribute to anything ever again should commence -- the open source world is too open to such attacks where things are developed by people who assume good faith actors.


upon further reflection all 3 of your points are cogent and fair and valid. my original point was a knee-jerk reaction to this. :/


Your being able to reflect upon it and analyze your own reaction is rare, valuable and appreciated


I think I went through all the stages of grief. Now, at the stage of acceptance, here’s what I hope: I hope justice is done. Whoever is doing this, be they a misguided current black hat (hopefully a future white hat) hacker, someone or some group that just wants to see the world burn, or something in between, I hope we see justice. And then forgiveness and acceptance and all that can happen later.

Mitnick reformed after he was convicted (whether you think that was warranted or not). Here, whether these folks are Mitnicks or bad actors, etc., let’s get all the facts on the table and figure this out.

What’s clear is that we all need to be ever vigilant: that seemingly innocent patch could be part of a more nefarious thing.

We’ve seen it before with that university sending patches to the kernel to “test” how good the core team was at security, and we saw how well that went over.

Anyways. Yeah. Glad you all allowed me to grow. And I learned that I have an emotional connection to open source for better or worse: so much of my life professional and otherwise is enabled by it and so threats to it I guess I take personally.


It is reasonable to consider all commits introduced by the backdoor author untrustworthy. This doesn't mean all of it is backdoored, but if they were capable of introducing this backdoor, their code needs scrutiny. I don't care why they did it, whether it's a state-sponsored attack, a long game that was supposed to end with selling a backdoor for all Linux machines out there for bazillions of dollars, or blackmail — this is a serious incident that should eliminate them from open-source contributions and the xz project.

There is no requirement to use your real name when contributing to open source projects. The name of the backdoor author ("Jia Tan") might be fake. If it isn't, and if somehow they are found to be innocent (which I doubt, looking at the evidence throughout the thread), they can create a new account with a new fake identity.


They might have burnt the reputation built for this particular pseudonym but what is stopping them from doing it again? They were clearly in it for the long run.


You're assuming that it's even a single person; it's just a Gmail address and an avatar with a "J" icon from some clip art.


I literally said "they"; I know, I know, in English that can also be interpreted as a gender-unspecific singular.

Anyways, yes it is an interesting question whether he/she is alone or they are a group. Conway's law probably applies here as well. And my hunch in general is that these criminal mad minds operate individually / alone. Maybe they are hired by an agency but I don't count that as a group effort.


Can legal action be taken against the author if it's found he maliciously added the backdoor?


Good luck with that. We don't even know what country he is from. Probably China, but even so, good luck finding him among 1.5 billion people.


It is not good to accept anything that contains unreadable data in place of the program's open text. It should be excluded.


I wonder who the target was!


Every Linux box inside AWS, Azure, and GCP and other cloud providers that retains the default admin sudo-able user (e.g., “ec2”) and is running ssh on port 22.

I bet they intended for their back door to eventually be merged into the base Amazon Linux image.


You don't need an "ec2" user. A backdoor can just allow root login even when that is disabled for people not using the backdoor.

It just requires the SSH port to be reachable unless there is also a callout function (which is risky as people might see the traffic). And with Debian and Fedora covered and the change eventually making its way into Ubuntu and RHEL pretty much everything would have this backdoor.


my understanding is that any Debian/RPM-based Linux running sshd would become vulnerable in a year or two. The best equivalent of this exploit is the One Ring.

So the really strange thing is why they put so little effort into making this undetectable. All they needed was to make it use less time to check each login attempt.


On the other hand, it was very hard to detect. The slow login time was the only thing that gave it away. It seems more like they were very close to being highly successful. In retrospect, improving the performance would have been the smart play. But that is one part that went wrong compared to the very many that went right.


Distro build hosts and distro package maintainers might not be a bad guess. Depends on whether getting this shipped was the final goal. It might have been just the beginning, part of some bootstrapping.


Probably less of an individual and more of an exploit to sell.


his account is active again on github https://github.com/JiaT75


Sleeper.


[flagged]


Not sure why people are downvoting you... it's pretty unlikely that various Chinese IoT companies would just decide it's cool to add a backdoor, which clearly implies that no matter how good their intentions are, they simply might have no other choice.


There are roughly speaking two possibilities here:

1. His machine was compromised and he wasn't at fault, beyond having less-than-ideal security (a sin we are all guilty of). His country of origin/residence is of no importance, and doxing him isn't fair to him.

2. This account was malicious. There's no reason we should believe that the identity behind wasn't fabricated. The country of origin/residence is likely falsified.

In neither case is trying to investigate who he is on a public forum likely to be productive. In both cases there's risk of aiming an internet mob at some innocent person who was 'set up'.


The back door is in the upstream GitHub tarball. The most obvious way to get stuff there is by compromising an old-style GitHub token. The new-style GitHub tokens are much better, but it’s somewhat opaque which options you need. Most people also don’t use expiring tokens. The author seems to have a lot of OSS contributions, so probably an easy target to choose.


Why do you exclude the possibility that this person was forced to add this at gunpoint?


Yes, exactly this. How do people think state actors have all those 0-day exploits? Excellent research? No! They are adding them themselves!


I think the letters+numbers naming scheme for both the main account and the sockpuppets used to get him access to xz and the versions into distros is a strong hint at (2). Taking over xz maintainership without any history of open source contributions is also suspicious.


Because it’s naive to think that the owner of the account used his real identity.


But my point is that people living in China might be "forced" to do such things, so we unfortunately can't ignore the country. Of course, practically this is problematic since the country can be faked.


[flagged]


Don't blame the guy. Could have happened to anyone. Even you.


[flagged]


It's an uncharitable summary that comes across as a personal attack, which is not allowed in HN comments.


the account was either sold or stolen


That's pure speculation and there are plenty of hints to the contrary.


Fascinating. Just yesterday the author added a `SECURITY.md` file to the `xz-java` project.

> If you discover a security vulnerability in this project please report it privately. *Do not disclose it as a public issue.* This gives us time to work with you to fix the issue before public exposure, reducing the chance that the exploit will be used before a patch is released.

Reading that in a different light, it says give me time to adjust my exploits and capitalize on any targets. Makes me wonder what other vulns might exist in the author's other projects.


Security Researchers: Is this request-for-private-disclosure + "90-days before public" reasonable?

It's a SEVERE issue, to my mind, and 90 days seems too long to me.


In this particular case, there is a strong reason to expect exploitation in the wild to already be occurring (because it's an intentional backdoor) and this would change the risk calculus around disclosure timelines.

But in the general case, it's normal for 90 days to be given for the coordinated patching of even very severe vulnerabilities -- you are giving time not just to the project maintainers, but to the users of the software to finish updating their systems to a new fixed release, before enough detail to easily weaponize the vulnerability is shared. Google Project Zero is an example of a team with many critical impact findings using a 90-day timeline.


As someone in security who doesn't work at a major place that gets invited to the nice pre-notification notifications, I hate this practice.

My customers and business are not any less important or valuable than anyone else's, and I should not be left potentially exploited, and my customers harmed, for 90 more days while the big guys get to patch their systems (thinking of e.g. Log4j, where Amazon, Meta, Google, and others were told privately how to fix their systems before others were, even though the fix was simple).

Likewise, as a customer I should get to know as soon as someone's software is found vulnerable, so I can then make the choice whether to continue to subject myself to the risk of continuing to use it until it gets patched.


> My ... business are not any less ... valuable than anyone else's,

Plainly untrue. The reason they keep distribution minimal is to maximise the chance of keeping the vuln secret. Your business is plainly less valuable than google, than walmart, than godaddy, than BoA. Maybe you're some big cheese with a big reputation to keep, but seeing as you're feeling excluded, I guess these orgs have no more reason to trust you than they have to trust me, or hundreds of thousands of others who want to know. If they let you in, they'd let all the others in, and odds are greatly increased that now your customers are at risk from something one of these others has worked out, and either blabbed about or has themselves a reason to exploit it.

Similarly plainly, by disclosing to 100 major companies, they protect a vast breadth of consumers/customer-businesses of these major companies at a risk of 10,000,000/100 (or even less, given they may have more valuable reputation to keep). Changing that risk to 12,000,000/10,000 is, well, a risk they don't feel is worth taking.


> Your business is plainly less valuable than google, than walmart, than godaddy, than BoA.

The company I work for has a market cap roughly 5x that of goDaddy and we're responsible for network connected security systems that potentially control whether a person can physically access your home, school, or business. We were never notified of this until this HN thread.

If your BofA account gets hacked, you lose money. If your GoDaddy account gets hacked, you lose your domain. If Walmart gets hacked, they lose... what, money, and have logistics issues for a while?

Thankfully my company's products have additional safeguards and this isn't a breach for us. But what if it was? Our customers can literally lose their lives if someone cracks the security and finds a way to remotely open all the locks in their home or business.

Don't tell me that some search engine profits or someone's emails history is "more valuable" than 2000 schoolchildren's lives.

How about you give copies of the keys to your apartment and a card containing your address to 50 random people on the streets and see if you still feel that having your Gmail account hacked is more valuable.


I think from an exposure point of view, I'm less likely to worry about the software side of my physical security being exploited than the actual hardware side.

None of the points you make are relevant, since I have yet to see any software-based entry product whose software security can be considered more than lackluster at best. Maybe your company is better; since you didn't mention a name, I can't say otherwise.

What I'm saying is your customers are more likely to have their doors physically broken than remotely opened by software, and you are here going on about life and death because of a vuln in xz?

If your company's market cap is as high as you say and they are as security-aware as you say, why aren't they employing security researchers and actively at the forefront of finding vulns and reporting them? That would get them an invite to the party.


Sorry, but that's not a serious risk analysis. The average person would be hurt a lot more by a godaddy breach by a state actor than by a breach of your service by a state actor.


Man if it was ever appropriate to tell someone to touch grass this would be it.

The think of the children part is a nice touch as well. 10/10 copypasta would repost.


> Your business is plainly less valuable than google, than walmart, than godaddy, than BoA.

Keep in mind it's the EROI not market cap.

A company is worth attacking if their reward:effort ratio is right. Smaller companies have a much lower effort required.


Being in a similar boat, I heartily agree.

But I don't want anyone else to get notified immediately, because the odds that somebody will start exploiting people before a patch is available are pretty high. Since I can't have both, I will choose the 90 days for the project to get patches done and all the packagers to include them and make them available, so that by the time it's public knowledge I'm already patched.

I think this is a Tragedy of the Commons type of problem.

Caveat: This assumes the vuln is found by a white hat. If it's being exploited already or is known to others, then I fully agree the disclosure time should be eliminated, and it's BS for the big companies to get more time than us.


OpenSSL's "notification of an upcoming critical release" is public, not private.

You do get to know that the vulnerability exists quickly, and you could choose to stop using OpenSSL altogether (among other mitigations) once that email goes out.


if your system has already been compromised at the root level, it does not matter in the least bit


Well if you assume everyone has already been exploited, disclosing quickly vs slowly won't prevent that.

Also, if something is being actively exploited, usually there's no or very little embargo.


Yeah I worked in FAANG when we got the advance notice of a number of CVEs. Personally I think it's shady, I don't care how big Amazon or Google is, they shouldn't get special privileges because they are a large corporation.


I don't think the rationale is that they are a large corporation or have lots of money. It's that they have many, many, many more users that would be affected than most companies have.


I imagine they also have significant resources to contribute to dealing with breaches - eg, analysing past cookouts by the bad actor, designing mitigations, etc.


I empathize with this as I've been in the same boat, but all entities are not equal when performing triage.


> My customers and business are not any less important or valuable than anyone else's

Hate to break it to you but yes they are.


> My customers and business are not any less important or valuable than anyone else's

Of course they are. If Red Hat has a million times more customers than you do then they are collectively more valuable almost by definition.


If OP is managing something that is critical to life - think fire suppression controllers, or computers that are connected to medical equipment, I think it becomes very difficult to compare that against financial assets.


At a certain scale, "economic" systems become critical to life. Someone who has sufficiently compromised a systemically-important bank can do things that would result in riots breaking out on the street all over a country.


You could use the EPA dollar to life conversion ratio.

Though anything actually potentially lethal shouldn't really have a standard Internet connection. E.g. nuclear power plants, trains, planes controls, heavy industrial equipment, nuclear weapons...


Something that is critical to life should not be connected to Internet.


And yet it seems like every new car is.


Sshhh now you are starting to talk like a rightwinger. Alex Jones has been saying this for a long time ;)


Such systems should be airgapped…


In that case OP should not design systems where an sshd compromise can have a life-threatening impact. Just because it's easier for everything to be controlled from the cloud doesn't mean that others need to feel sympathy when that turns out to be as bad an idea as everyone else said.


I can think of two approaches for such companies:

a. Use commercial OS vendors who will push out fixes.

b. Set up a Continuous Integration process where everything is open source and is built from the ground up, with some reliance on open source platforms such as distros.

One needs different types of competence and IT Operational readiness in each approach.


> b. Set up a Continuous Integration process where everything is open source and is built from the ground up, with some reliance on open source platforms such as distros.

How would that have prevented this backdoor?


> but to the users of the software to finish updating their systems to a new fixed release,

Is there "a new fixed release" ?


Whether it's reasonable is debatable, but that type of time frame is pretty normal for things that aren't being actively exploited.

This situation is perhaps a little different, as it's not an accidental bug waiting to be discovered but an intentionally placed exploit. We know that a malicious person already knows about it.


Detecting a security issue is one thing. Detecting a malicious payload is something completely different. The latter comes with intent to exploit and must be addressed immediately. The former has at least some chance of no one knowing about it.


If you were following Google Project Zero's policy (which many researchers do), any in-the-wild exploits would trigger an immediate reveal.


I think you have to take the credibility of the maintainer into account.

If it's a large company, made of people with names and faces, with a lot to lose by hacking its users, they're unlikely to abuse private disclosure. If it's some tiny library, the maintainers might be in on it.

Also, if there's evidence of exploitation in the wild, the embargo is a gift to the attacker. The existence of a vulnerability in that case should be announced, even if the specifics have to be kept under embargo.


In this case the maintainer is the one who deliberately introduced the backdoor. As Andres Freund puts it deadpan, "Given the apparent upstream involvement I have not reported an upstream bug."


imho it depends on the vuln. I've given a vendor over a year, because it was a very low risk vuln. This isn't a vuln though - this is an attack.


> imho it depends on the vuln. I've given a vendor over a year, because it was a very low risk vuln.

But why? A year is a ridiculous time for fixing a vulnerability even a minor one. If a vendor is taking that long its because they don't prioritize security at all and are just dragging their feet.


The fraudulent author must have enjoyed the 'in joke' -- he's the one creating vulnerabilities...


I've always laughed my ass off at the idea of a disclosure window. It takes less than a day to find RCE that grants root privileges on devices that I've bothered to look at. Why on earth would I bother spending months of my time trying to convince someone to fix something?


A 90-day dark window for maintainers is SOP though. Then after 90 days, it's fair game for public disclosure.


How many people like this one exist?


If this question had a reliable (and public) answer then the world would be a very different place!

That said, this is an important question. We, particularly those us who work on critical infrastructure or software, should be asking ourselves this regularly to help prevent this type of thing.

Note that it's also easy (and similarly catastrophic) to swing too far the other way and approach all unknowns with automatic paranoia. We live in a world where we have to trust strangers every day, and if we lose that option completely then our civilization grinds to a halt.

But-- vigilance is warranted. I applaud these engineers who followed their instincts and dug into this. They all did us a huge service!

EDIT: wording, spelling


Yeah thanks for saying this; I agree. And as cliche as it is to look for a technical solution to a social problem, I also think better tools could help a lot here.

The current situation is ridiculous - if I pull in a compression library from npm, cargo or Python, why can that package interact with my network, make syscalls (as me) and read and write files on my computer? Leftpad shouldn’t be able to install crypto ransomware on my computer.

To solve that, package managers should include capability based security. I want to say “use this package from cargo, but refuse to compile or link into my binary any function which makes any syscall except for read and write. No open - if I want to compress or decompress a file, I’ll open the file myself and pass it in.” No messing with my filesystem. No network access. No raw asm, no trusted build scripts and no exec. What I allow is all you get.

The capability should be transitive. All dependencies of the package should be brought in under the same restriction.

In dynamic languages like (server side) JavaScript, I think this would have to be handled at runtime. We could add a capability parameter to all functions which issue syscalls (or do anything else that’s security sensitive). When the program starts, it gets an “everything” capability. That capability can be cloned and reduced to just the capabilities needed. (Think, pledge). If I want to talk to redis using a 3rd party library, I pass the redis package a capability which only allows it to open network connections. And only to this specific host on this specific port.

It wouldn’t stop all security problems. It might not even stop this one. But it would dramatically reduce the attack surface of badly behaving libraries.


Isn't this exact exploit left unfixed by your capability theory?

It hijacks a process that has network access at runtime, not build time.

The build hack grabs files from the repo and inspects build parameters (in a benign way, everyone checks whether you are running on X platform etc)


The problem we have right now is that any linked code can do anything, both at build time and at runtime. A good capability system should be able to stop xz from issuing network requests even if other parts of the process do interact with the network. It certainly shouldn't have permission to replace crc32_resolve() and crc64_resolve() via ifunc.

Another way of thinking about the problem is that right now every line of code within a process runs with the same permissions. If we could restrict what 3rd party libraries can do - via checks either at build time or runtime - then supply chain attacks like this would be much harder to pull off.


I'm not convinced this is such a cure-all, as any library must necessarily have the ability to "taint" its output. Like, consider this library. It's a compression library. You would presumably trust it to decompress things, right? Like programs? And then you run those programs with full permissions? Oops...


It’s not a cure-all. I mean, we’re talking about infosec - so nothing is. But that said, barely any programs need the ability to execute arbitrary binaries. I can’t remember the last time I used eval() in JavaScript.

I agree that it wouldn’t stop this library from injecting backdoors into decompressed executables. But I still think it would be a big help anyway. It would stop this attack from working.

At the big picture, we need to acknowledge that we can’t implicitly trust opensource libraries on the internet. They are written by strangers, and if you wouldn’t invite them into your home you shouldn’t give them permission to execute arbitrary code with user level permissions on your computer.

I don’t think there are any one size fits all answers here. And I can’t see a way to make your “tainted output” idea work. But even so, cutting down the trusted surface area from “leftpad can cryptolocker your computer” to “Leftpad could return bad output” sounds like it would move us in the right direction.


There are attacks that embed hacks into built compilers so unless you are looking to write your software from scratch you need to trust people.

And by scratch I mean "without modern hardware" given supply chain attacks also apply to the hardware you build from.


Of course we need to trust people to some degree. There's an old Jewish saying - put your trust in god, but your money in the bank. I think its like that. I'm all for trusting people - but I still like how my web browser sandboxes every website I visit. That is a good idea.

We (obviously) put too much trust in little libraries like xz. I don't see a world in which people start using fewer dependencies in their projects. So given that, I think anything which makes 3rd party dependencies safer than they are now is a good thing. Hence the proposal.

The downside is it adds more complexity. Is that complexity worth it? Hard to say. Thats still worth talking about.


I guess the big open-source community should put a little more trust in statistics, or integrate statistical evaluation into its decision-making about which specific products to use in its supply chains.

There is some research on the right track already: https://www.se.cs.uni-saarland.de/projects/congruence/


This approach could work for dynamic libraries, but a lot of modern ecosystems (Go, Rust, Swift) prefer to distribute packages as source code that gets compiled with the including executable or library.


Yes, and?

The goal is to restrict what included libraries can do. As you say, in languages like Rust, Go or Swift, the mechanism to do this would also need to work with statically linked code. And that's quite tricky, because there are no isolation boundaries between functions in executables.

It should still be possible to build something like this. It would just be inconvenient. In rust, swift and go you'd probably want to implement something like this at compile time.

In Rust, I'd start by banning unsafe in dependencies. (Or whitelisting which projects are allowed to use unsafe code.) Then add special annotations on all the methods in the standard library which need special permissions to run: for example, File::open, fork, exec, networking, and so on. In Cargo.toml, add a way to specify which permissions your child libraries get: "Import serde, but give it no OS permissions". When you compile your program, the compiler can look at the call tree of each function to see what actually gets called, and make sure the permissions match up. If you call a function in serde which in turn calls File::open (directly or indirectly), and you didn't explicitly allow that, the program should fail to compile.

It should be fine for serde to contain some utility function that calls the banned File::open, so long as the utility function isn't called.

Permissions should be in a tree. As you get further out in the dependency tree, libraries get fewer permissions. If I pass permissions {X,Y} to serde, serde can pass permission {X} to one of its dependencies in turn. But serde can't pass permission {Q} to its dependency - since it doesn't have that capability itself.

Any libraries which use unsafe are sort of trusted to do everything. You might need to insist that any package which calls unsafe code is actively whitelisted by the cargo.toml file in the project root.
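
To make the Cargo.toml part concrete, here's a rough sketch of what such a manifest could look like. To be clear, this is invented syntax - nothing like a [capabilities] section exists in Cargo today, and the permission names and crate choices are made up purely for illustration:

    # Hypothetical Cargo.toml syntax - not a real Cargo feature today.
    [dependencies]
    serde = "1"
    leftpad = "0.1"
    reqwest = "0.12"

    # Invented section: what each dependency (and everything it pulls in) may do.
    [capabilities]
    serde = []                # pure computation: no fs, no network, no exec
    leftpad = []
    reqwest = ["net"]         # may open sockets, but not spawn processes

    # Only explicitly whitelisted crates may contain unsafe code.
    [capabilities.allow-unsafe]
    crates = ["ring"]

The compiler (or cargo) would then walk the call graph and refuse to build if, say, serde or anything it depends on reached File::open or a socket call.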


> It should still be possible to build something like this. It would just be inconvenient.

Inconvenient is quite the understatement. Designing and implementing something like this for each and every language compiler/runtime requires hugely more effort than doing it at the OS level. The likelihood of mistakes is also far greater.

Perhaps it's worth exploring whether it can be done on the LLVM level so that at least some languages can share an implementation.


Do you understand how ifuncs work? They are in the address space of the application they run in. liblzma is resolving its own pointers!


If I got it right, the attack uses the glibc IFUNC mechanism to patch sshd (and only sshd) so that it directly runs some code in liblzma when sshd verifies logins.

So the problem is the IFUNC mechanism, which has its valid uses but can be EASILY misused for all sorts of attacks.
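
For anyone who hasn't seen one, an ifunc looks roughly like this. A toy example assuming gcc/glibc on Linux, with made-up names - this is not liblzma's actual code:

    #include <stdio.h>

    /* Two candidate implementations (placeholders). */
    static unsigned crc_generic(unsigned x) { return x ^ 0xFFFFFFFFu; }
    static unsigned crc_clmul(unsigned x)   { return x ^ 0xFFFFFFFEu; }

    /* The resolver runs inside whatever process loads this code, at
       symbol-resolution time, with that process's full privileges.
       Nothing limits it to CPU feature detection - it can open files,
       sockets, or patch other function pointers. */
    static unsigned (*resolve_crc(void))(unsigned) {
        return 1 ? crc_clmul : crc_generic;  /* a real resolver would check CPU features */
    }

    /* 'crc' is a GNU indirect function: every caller gets bound to
       whatever the resolver returned. */
    unsigned crc(unsigned x) __attribute__((ifunc("resolve_crc")));

    int main(void) {
        printf("%08x\n", crc(0));
        return 0;
    }

The important bit is that the resolver runs during dynamic linking, before any of the program's own code - and for liblzma the loading process ends up being sshd (via libsystemd) - which is why a malicious resolver makes such a convenient entry point.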


A process can do little to defend itself from a library it's using which has full access to the same memory. There is no security boundary there. This kind of backdoor doesn't hinge on IFUNC's existence.


Honestly, I don't have a lot of hope that we can fix this problem for C on Linux. There's just so much historical cruft present, spread across autotools, configure, make, glibc, gcc and C itself, that would need to be modified to support capabilities.

The rule we need is "If I pull in library X with some capability set, then X can't do anything not explicitly allowed by the passed set of capabilities". The problem in C is that there is currently no straightforward way to firewall off different parts of a Linux process from each other. And dynamic linking on Linux is done by gluing together compiled artifacts - with no way to check or understand what assembly instructions any of those parts contain.

I see two ways to solve this generally:

- Statically - i.e. at compile time, the compiler annotates every method with the set of permissions it (recursively) requires. The program fails to compile if a method is called which requires permissions that the caller does not pass it. In Rust, for example, I could imagine cargo enforcing this for Rust programs. But I think it would require some changes to the C language itself if we want to add capabilities there. Maybe some compiler extensions would be enough - but probably not, given a C program could obfuscate which functions call which other functions.

- Dynamically. In this case, every Linux system call is replaced with a new version which takes a capability object as a parameter. When the program starts, it is given a root capability by the OS, and it can then use that to make child capabilities which get passed to different libraries. I could imagine this working in Python or JavaScript. But for this to work in C, we need to stop libraries from just scanning the process's memory and stealing capabilities from elsewhere in the program.
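
Here's a toy model of the dynamic idea in C - purely hypothetical, none of this API exists in Linux today, and in a real design the token would have to be an unforgeable kernel handle rather than the plain struct used here:

    #include <stdio.h>

    #define CAP_FS_READ 0x1u
    #define CAP_NET     0x2u

    /* In a real system this would be an unforgeable kernel handle. */
    typedef struct { unsigned rights; } cap_t;

    /* A child capability can only carry a subset of its parent's rights. */
    static cap_t cap_derive(cap_t parent, unsigned rights) {
        cap_t child = { parent.rights & rights };
        return child;
    }

    /* Capability-checked wrappers that would replace the raw syscalls. */
    static int cap_open_read(cap_t c, const char *path) {
        if (!(c.rights & CAP_FS_READ)) { fprintf(stderr, "denied: open %s\n", path); return -1; }
        printf("opened %s read-only\n", path);
        return 0;
    }
    static int cap_connect(cap_t c, const char *host, int port) {
        if (!(c.rights & CAP_NET)) { fprintf(stderr, "denied: connect %s:%d\n", host, port); return -1; }
        printf("connected to %s:%d\n", host, port);
        return 0;
    }

    /* An untrusted third-party library can only act through the capability
       it was handed, so "phoning home" fails at the wrapper. */
    static int third_party_decompress(cap_t io, const char *path) {
        cap_open_read(io, path);                       /* allowed */
        return cap_connect(io, "evil.example", 443);   /* denied: no CAP_NET */
    }

    int main(void) {
        cap_t root = { CAP_FS_READ | CAP_NET };          /* granted at startup */
        cap_t fs_only = cap_derive(root, CAP_FS_READ);   /* narrowed for the library */
        third_party_decompress(fs_only, "/var/log/app.log.xz");
        return 0;
    }

Of course, in this toy version the struct is trivially forgeable - a malicious library could fabricate a cap_t with all bits set, or copy the root capability out of main's stack - which is exactly the "stop libraries scanning the process's memory" problem above.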


Or take the Chrome / original Go approach: load that code in a different process, use some kind of RPC. With all the context switch penalty... sigh, I think it is the only way, as the MMU permissions work at a page level.


Firefox also has its solution of compiling dependencies to wasm, then compiling the wasm back into C code and linking that. It’s super weird, but the effect is that each dependency ends up isolated in bounds-checked memory. No context-switch penalty, but instead the code runs significantly slower.


The problem is that the attacker has code execution in sshd, not ifuncs.


> We, particularly those us who work on critical infrastructure or software

We should also be asking ourselves if we are working on critical infrastructure. Lasse Collin probably did not consider liblzma being loaded by sshd when vetting the new maintainer. Did the xz project ever agree to this responsibility?

We should also be asking ourselves if each dependency of critical infrastructure is worth the risk. sshd linking libsystemd just to write a few bytes into an open fd is absurd. libsystemd pulling in liblzma because hey, it also does compressed logging, is absurd. Yet this kind of absurd dependency bloat is everywhere.


Assume 3% of the population is malicious.

Enough to be cautious, enough to think about how to catch bad actors, not so much as to close yourself off and become a paranoid hermit.


We live in a time of populous, wealthy dictatorships that have computer-science expertise and are openly hostile to the US and Canada.

North America is only about 5% of the world's population. [1] (We can assume that malicious actors are in North America, too, but this helps to adjust our perspective.)

The percentage of maliciousness on the Internet is much higher.

[1] See continental subregions: https://en.wikipedia.org/wiki/List_of_continents_and_contine...


The US government itself is openly hostile to the US (as well as to the rest of the world).

> The percentage of maliciousness on the Internet is much higher.

A baseless assumption.


Huh? The empirical evidence we have - thanks to the Snowden leaks - paints a different picture. The NSA is the biggest malicious actor, with nearly unlimited resources at hand. They even insert hardware backdoors and intercept shipments to do that.


> NSA is the biggest malicious actor

I'm curious, how do you rank CN, RU, and IR?


They are all active players, but nowhere close to the top dog.


That doesn't mean that the entire population of those countries is actively hostile to you.


[flagged]


> If you put it in other words

You may of course choose whatever words you like. But your statement is nonsense, and the failures in the Middle East are all humanity's fault.

The kakistocracies of CN, RU, and IR are not the feeble foes of democracy and capitalism that they were in the 20th century.


Huh. I never really thought of it as a percentage.

I've been evil, been wonderful, and indifferent at different stages in life.

I have known those who have done similar for money, fame, and boredom.

I think, given a backstory, incentive, opportunity, and resources, it would be possible for most people to flip from "wouldn't" to enlisted.

Leverage has been shown to be the biggest lever when it comes to compliance.


It's doubtful you've been evil, or at least, you are really lacking in imagination of the true scope of what that word implies.


"Assume that 3% of the people you encounter will act maliciously."


The line between good and evil cuts through the heart of every person


Threat actors create personas. We will need strong social trust to protect our important projects and dependencies.


> How many of people like this one exist?

I guess every three-letter agency has at least one. You can do the math. They haven't learned anything from SolarWinds.


Honestly it seems like a state-based actor hoping to get whatever high value target compromised before it's made public. Reporting privately buys them more time, and allows them to let handlers know when the jig is up.


Looks like one of the backdoor authors even went and disabled the feature the exploit relied on directly on oss-fuzz to prevent accidental discovery: https://social.treehouse.systems/@Aissen/112180302735030319 https://github.com/google/oss-fuzz/pull/10667

But luckily there was some serendipity: "I accidentally found a security issue while benchmarking postgres changes." https://mastodon.social/@AndresFreundTec/112180083704606941


This is getting addressed here: https://github.com/google/oss-fuzz/issues/11760


This in and of itself can be legitimate. ifunc has real uses, and it indeed does not work when the sanitizer is enabled. Similar change in llvm: https://github.com/llvm/llvm-project/commit/1ef3de6b09f6b21a...


Because of the exploit, though - why should we use configurations in production that were not covered by these tests?


Could that commit also be made by a bad actor?


And that was in mid-2023. Very funny that Wikipedia on this issue says

> It is unknown whether this backdoor was intentionally placed by a maintainer or whether a maintainer was compromised

Yeah, if you've been compromised for a year, your attacker is now your identity. You can't just wave your hands and say "practice infosec hygiene".


I've long since said that if you want to hide something nefarious you'd do that in the GNU autoconf soup (and not in "curl | sh" scripts).

Would be interesting to see what's going on here; the person who did the releases has done previous releases too (are they affected?) and has commits going back to 2022 – relatively recent, but not that recent. Many are real commits with real changes, and they have commits on some related projects like libarchive. Seems like a lot of effort just to insert a backdoor.

Edit: anyone with access can add files to existing releases and it won't show that someone else added them (I just tested). However, the timestamp of the file will be when you uploaded it, not that of the release. On xz all the timestamps of the files match the timestamp of the release (usually the .tar.gz is a few minutes earlier, which makes sense). So it looks like they were done by the same person who did the release. I suspected someone else might have added/altered the files briefly after the release before anyone noticed, but that doesn't seem to be the case.


> I've long since said that if you want to hide something nefarious you'd do that in the GNU autoconf soup (and not in "curl | sh" scripts).

Yeah, I've been banging on that same drum for ages too... for example on this very site a decade ago: https://news.ycombinator.com/item?id=7213563

I'm honestly surprised that this autoconf vector hasn't happened more often... or more often that we know of.


Given that this was discovered by sheer luck, I'd expect way more such exploits in the wild.


Every single commit this person ever did should immediately be rolled back in all projects.


It's weird and disturbing that this isn't the default perspective.


Well, it is much easier said than done. Philosophically I agree, but in the real world, where you have later commits that might break, downstream projects, etc., it isn't very practical. It strikes me as in a similar vein to high school students and beauty pageant contestants calling for world peace. Really great goal, not super easy to implement.

I would definitely be looking at every single commit though and if it isn't obviously safe I'd be drilling in.


Some of those commits might fix genuine vulnerabilities. So you might trade a new backdoor for an old vulnerability that thousands of criminal orgs have bots for exploiting.

Damage-wise, most orgs aren't going to be hurt much by the NSA or the Chinese equivalent getting access, but a Nigerian criminal gang? They're far more likely to encrypt all your files and demand a ransom.


Still, at this point the default assumption should be that every commit is a vulnerability, or facilitates a potential vulnerability.

For example, a change from safe_fprintf to fprintf. Every commit should be reviewed and either tweaked or rewritten to ensure the task is being done in the safest way, and that nothing is "off" or deviates from the way that codebase normally goes about tasks within its functions.


Surely this is happening right now.

A lot of eyes are on the code. From all sides. Folks trying to find old unpatched backdoors to exploit or patch.


it's not weird at all?

Randomly reverting two years of changes across dozens of repositories will break them: it will almost certainly make them unbuildable, and also make them unreleasable in case any other change needs to happen soon.

all of their code needs to be audited to prove it shouldn't be deleted, of course, but that can't happen in the next ten minutes.

I swear that HN has the least-thought-through hot takes of any media in the world.


*I swear that HN has the least-thought-through hot takes of any media in the world.*

The irony is too good.


Yeah, if you've ever tried to revert stuff that was done weeks ago on a relatively small team, you know how much painstaking work it can be.


You can't just go and rip out old code; it'll break everything else. You have to review each commit and decide what to do with each.


"immediately" could mean have humans swarm on the task and make a choice, as opposed to

    for commit in $author_commits; do
        git revert "$commit"
    done


Imagine someone tried to revert all the commits you ever did. Doesn't sound easy.


Too much fallout.


Rolling back two years worth of commits made by a major contributor is going to be hell. I'm looking forward to see how they'll do this.


Not really. xz worked fine 2 years ago. Roll back to 5.3.1 and apply a fix for the 1 security hole that was fixed since that old version. (ZDI-CAN-16587)

Slight oversimplification, see https://bugs.debian.org/1068024 discussion.


This seems true with so many of these core libraries. Change for the sake of change introduces attack vectors. If it ain't broke, don't fix it!


Yeah but people will cry "dead project" if there hasn't been a release for a week.


How will you do that practically, though? That’s probably thousands of commits, upon which tens or hundreds of thousands of commits from others were built. You can’t just roll back everything two years and expect it not to break or bring back older vulnerabilities that were patched in those commits.


Likely part of what the attacker(s) are counting on. Anyone want to place odds this isn't the only thing that's going to be found?


I’d bet you at even odds that nothing else malicious by this person is found in 1 month, and at 1:2.5 odds that nothing is found in a year.


Only if you consider "this person" to be equal to "this identity".


I don’t think that’s necessary: there are enough eyes on this person’s work now.


No one will do it seriously


> they have commits on some related projects like libarchive

Windows started using libarchive to support .rar, .7z, ...

https://arstechnica.com/gadgets/2023/05/cancel-your-winrar-t...


Couldn't the autoconf soup be generated from simpler inputs by the CI/CD system to avoid this kind of problem? Incomprehensible soup as a build artifact (e.g. executables) is perfectly normal, but it seems to me that such things don't belong in the source code.

(This means you too, gradle-wrapper! And your generated wrapper for your generated wrapper. That junk is not source code and doesn't belong in the repo.)


Yes, it's usually regenerated already. However even the source is often pretty gnarly.

And in general, the build system of a large project is doing a lot of work and is considered pretty uninteresting and obscure. Random CMake macros or shell scripts would be just as likely to host bad code.

This is also why I like meson, because it's much more constrained than the others and the build system tends to be more modular and the complex parts split across multiple smaller, mostly independent scripts (written in Python or bash, 20-30 lines max). It's still complex, but I find it easier to organize.


> And in general, the build system of a large project is doing a lot of work and is considered pretty uninteresting and obscure. Random CMake macros or shell scripts would be just as likely to host bad code.

Build systems can even have undefined behaviour in the C++ sense. For example Conan 2 has a whole page on that.


The other thing besides the autoconf soup is the XZ project contains incomprehensible binaries as "test data"; the "bad-3-corrupt_lzma2.xz" part of the backdoor that they even put in the repo.

It's entirely possible they could have got that injection through review even if they had that framework, by instead putting it in the source files used to generate the autoconf soup.


gradle-wrapper is just a convenience; you can always just build the project with an installed version of gradle. Although I get your point, it’s a great place to hide nefarious code.


Pure speculation, but my guess is a specific state actor (ahem) is looking for developers innocently working with open source, to then strongarm them into doing stuff like this.


Or hiring them to do it for years without telling them why until they need a favor.


Many people are patriots of their countries. If a state agency approached them proposing paid OSS work that also helps their country fight terrorism/dictatorships/capitalists/whatever-they-believe-in, they would feel like they're killing two birds with one job.


While this seems plausible, it is notable that this person seems to be anonymous from the get go. Most open source maintainers are proud of their work and maintain publicly available personas.


While I don't doubt there are people who would gladly do this work for money/patriotism/whatever, adding a backdoor to your own project isn't really reconcilable with the motivations behind wanting to do OSS work.


I would be curious if their commits could be analyzed for patterns that could then be used to detect commits from their other account


One thing that is annoying is that many open source projects have been getting "garbage commits" apparently from people looking to "build cred" for resumes or such.

Easier and easier to hide this junk in amongst them.


annoying ... and convenient for some!


There was a DARPA program on this topic called Social Cyber. [1]

1. https://www.darpa.mil/program/hybrid-ai-to-protect-integrity...


I mean, a backdoor at this scale (particularly if it wasn't noticed for a while and got into stable distros) could be worth millions. Maybe hundreds of millions (think of the insider trading possibilities alone, not to mention espionage). 2 years doesn't seem like that much work relative to the potential pay off.

This is the sort of case where America's over-the-top hacking laws make sense.


And what law would you use to target someone who wrote some code and posted it for free on the internet that was willingly consumed?


The Computer Fraud and Abuse Act? Seems like a pretty easy question to answer.


Maybe I'm misunderstanding things, but it seems like anyone can publish an exploit on the internet without it being a crime. In the same way encryption is free speech.

It would seem unlikely this guy would also be logging into people's boxes after this.

It seems a much tougher job to link something like this to an intentional unauthorized access.

At this point, we have no confirmed access via compromise.

Do you know of a specific case where the existence of a backdoor has been prosecuted without a compromise?

Who would have standing to bring this case? Anyone with a vulnerable machine? Someone with a known unauthorized access? Other maintainers of the repo?

IANAL, but it is unclear that a provable crime has been committed here.


> IANAL

Best to leave it at that.

It's not worth your time or the reader's time trying to come up with a technicality to make it perfectly legal to do something we know little about, other than it's extremely dangerous.

Law isn't code, you gotta violate some pretty bedrock principles to pull off something like this and get away with it.

Yes, if you were just a security researcher experimenting on GitHub, it's common sense you should get away with it*, and yes, it's hard to define a logical proof that ensnares this person, and not the researcher.

* and yes, we can come up with another hypothetical where the security researcher shouldn't get away with it. Hypotheticals all the way down.


I think this thread is talking at cross-purposes.

1. It should be legal to develop or host pen-testing/cracking/fuzzing/security software that can break other software or break into systems. It should be illegal to _use_ the software to gain _unauthorised_ access to others' systems. (e.g. it's legal to create or own lockpicks and use them on your own locks, or locks you've been given permission to pick. It's not legal to gain unauthorised access _using_ lockpicks)

2. It should be illegal to develop malware that _automatically_ gains unauthorised access to systems (trojans, viruses, etc.). However, it should be legal to maintain an archive of malware, limiting access to vetted researchers, so that it can be studied, reverse-engineered and combatted. (e.g. it's illegal to develop or spread a bioweapon, but it's ok for authorised people to maintain samples of a bioweapon in order to provide antidotes or discover what properties it has)

3. What happened today: It should be illegal to intentionally undermine the security of a project by making bad-faith contributions to it that misrepresent what they do... even if you're a security researcher. It could only possibly be allowed if an agreement was reached in advance with the project leaders to allow such intentional weakness-probing, with a plan to reveal the deception and treachery.

Remember when university researchers tried to find if LKML submissions could be gamed? They didn't tell the Linux kernel maintainers they were doing that. When the Linux kernel maintainers found out, they banned the entire university from making contributions and removed everything they'd done.

https://lkml.org/lkml/2021/4/21/454

https://arstechnica.com/gadgets/2021/04/linux-kernel-team-re...


Talking at cross-purposes?

No, people being polite and avoiding the more direct answer that'd make people feel bad.

The rest of us understand that intuitively, and that it is already the case, so pretending there was some need to work through it, at best, validates a misconception for one individual.

Less important, as it's mere annoyance rather than infohazard: it's wildly off-topic. Legal hypotheticals where a security researcher released "rm -rf *" on GitHub and ended up in legal trouble are 5 steps downfield even in this situation, and it is a completely different situation.


I'm not looking for a loophole or a legal hypothetical, I'm wondering if our laws are keeping up, which they very often do not with tech.

This is not unauthorized access, but it is also clearly wrong. I'm wondering if it's illegal, or if it's unauthorized access...


And of course an attacker like this has a high likelihood of being a state actor, comfortably secure in their native jurisdiction.


> but it seems like anyone can publish an exploit on the internet without being a crime

Of course. The mere publishing of the exploit is not the criminal part. It's the manner & intent in which it was published that is the problem.

> At this point, we have no confirmed access via compromise.

While I don't know the specifics of this particular law, generally it doesn't matter what you actually did. What is relevant is what you tried to do. Lack of success doesn't make you innocent.

> Who would have standing to bring this case?

The state obviously. This is a criminal matter not a civil one. You don't even need the victim's consent to bring a case.

[IANAL]


Some types of criminal cases are only pursued on a victim's complaint.


Not this kind!

See for example page 35 in the Justice Department’s computer crimes handbook (dated, but basically AIUI the same way they still do things) [0]

[0] https://www.justice.gov/d9/criminal-ccips/legacy/2015/01/14/...


By this logic you could say that leaving a poisoned can of food in a public pantry is not a crime because poison is legal for academic purposes, and whoever ate it took it willingly.

Also, I think getting malicious code into a repo counts as a compromise in and of itself.


Or shooting someone isn't a crime, since you only pulled the trigger. After all, it was the bullet that killed them.


Similar laws to the ones we use to prosecute someone who intentionally brought a poisoned cake to the potluck.


Are you suggesting intent is impossible to determine?


> I've long since said that if you want to hide something nefarious you'd do that in the GNU autoconf soup

If I recall correctly, xz can be built with both autoconf and cmake; are the cmake configs similarly affected?


Yes, there is evidence of sabotage on the CMake configs too.

https://git.tukaani.org/?p=xz.git;a=commit;h=f9cf4c05edd14de...


How about wheels in the Python ecosystem?


Yeah, this was my first thought too. Though the case against autoconf is already so overwhelming that I think anyone still using it is just irredeemable; this isn't going to persuade them.

