Designing a Low Latency 10G Ethernet Core (2023)

(ttchisholm.github.io)

133 points | by picture 12 hours ago

3 comments

Neywiny 10 hours ago
It seems fun to be a high frequency FPGA trader designer. All my FPGA work is much lower power, so I don't get to play with stuff like gigs of external SRAM or the QDR stuff or whatnot.
[-]
- amelius 2 hours ago
  > It seems fun to be a high frequency FPGA trader designer.
  Yes, if you think that doing pointless work is fun. What do you tell your grandchildren?
  [-]
  - jbstack 1 hour ago
    It's a bit unfair to call it pointless. Empirical studies have shown that HFT increases liquidity in markets, leading to narrower spreads, greater market depth, faster execution speeds, cross-market informational efficiency, etc. All of these are generally beneficial to other non-HFT market participants.
    Now you might counter that it also brings some negatives (e.g. fleeting liquidity, market manipulation risk, etc.) and there's certainly some validity to that. But none of this supports an argument that it's pointless.
    More generally, activities aren't pointless if you are earning money from them, money which pays your rent, puts food on your table, etc. A large portion of the world works mainly for this goal. It's perfectly valid to tell your grandchildren "I spent my life making sure we'd have enough money for you and your parents to live comfortably", and we shouldn't look down on someone who does that.
    [-]
    - Retric 26 minutes ago
      > HFT increases liquidity
      On average with a few caveats yes, but it also causes issues by reducing the vastly more important signaling aspects of the market.
      Trade volume for example can get almost completely divorced from economic reality.
    - amelius 1 hour ago
      > Also, it's not pointless to the person actually engaging in HFT if they are making money from it. Activities aren't pointless if they pay your rent, put food on your table, etc.
      You have to look at the bigger picture. HFT companies only increase the barrier to entry to participate in the market. This means that while you may be able to put food on the table, you make it more difficult for other people to put food on the table.
      [-]
      - OneDeuxTriSeiGo 1 hour ago
        Markets should not primarily be a casino.
        If you want to invest some of your income into a spread of companies? Go for it. But if all you are doing is trying to game the market? That's just gambling, go to a real casino and have some fun instead.
        HFT should be driven by these types of market makers exactly for the purpose of decreasing the amount of unnecessary speculation and volatility in the markets. HFTs killing the spreads is one of the only reasons the markets didn't implode during covid despite the extraordinary fear and volatility that was impacting them.
        You want a parasite? Look to hedge funds who just sit on piles of other people's cash to scoop obscene amounts off the top (relative to their actual contribution) for occasionally shuffling funds between financial instruments.
        [-]
        amelius 1 hour ago
        > HFTs killing the spreads is one of the only reasons the markets didn't implode during covid despite the extraordinary fear and volatility that was impacting them.
        And why would this not be possible if everybody traded at the same, low, frequency? Alternatively, what is the minimum trading frequency required to save markets from collapsing?
        By trying to answer these questions you see how ridiculous the situation with HFT really is.
      - jbstack 1 hour ago
        How's that? HFT lowers the barrier of entry for ordinary participants who just want to invest their savings either directly or through something like a pension fund, because they get all those benefits I mentioned, and those benefits are useful when you are only executing a trade once a month or year.
        It only increases the barrier to entry to people who want to do HFT themselves on a smaller scale, because they now can't compete on latency etc. But it's a bit of a circular argument to say that it's a bad thing that engaging in HFT increases the barrier of entry to HFT.
        In any case, and to reiterate, I wasn't refuting a claim that HFT is a bad thing, I was refuting your original claim that it's pointless. For something to be pointless, there has to be no purpose to it whatsoever, which means it has to have no benefits.
        [-]
        amelius 1 hour ago
        We all know that middlemen are bad.
        We don't need an HFT middleman.
  - KeplerBoy 4 minutes ago
    Meh, at least HFT is making sure markets are efficient.
    Now imagine telling your grandkids you worked on pushing short form content in absolutely every app to maximize engagement and rot brains?
  - le-mark 1 hour ago
    High frequency trading is far from the only pointless activity software and hardware developers are employed to do. At least in that realm they are pushing the state of the art, unlike presenting ads or yet another crud app.
    [-]
    - amelius 1 hour ago
      Pushing the state of the art? That's only meaningful if the results are published. But I suspect that everything is kept secret.
      Come on, it is not difficult to find better things to do with your brains. Like work in medicine, etc.
      [-]
      - OneDeuxTriSeiGo 1 hour ago
        You do realise that Jane Street (one of the big HFT market makers) publishes basically all their technical work and contributes to massive amounts of open source software? They are to my knowledge one of the main contributors to the OCaml ecosystem as well.
  - bombcar 33 minutes ago
    Will the grandchildren be able to hear you over the roar of the yacht?
- mungoman2 8 hours ago
  I agree! Seems like very interesting work.
- typpilol 10 hours ago
  I feel like it's also got to be super stressful
  Imagine getting a call that your system is down and is costing them millions per hour
  [-]
  - zorked 3 hours ago
    The way to think is the opposite: your system is making millions per hour.
    [-]
    - ktm5j 2 hours ago
      What about when something goes wrong because you overlooked some detail?
      [-]
      - jacquesm 2 hours ago
        Shit happens. I wrote the firmware for a multi-link HDLC card which earned vast amounts of money for the company that commissioned it. I was quite aware of my responsibility and so was the hardware team. We stress tested the crap out of it before releasing to production and fortunately it never locked up when it was in actual use but we had some pretty wild and very rare (so hard to trigger and reproduce) bugs which delayed deployment considerably. But none of this gave me the kind of feeling that working on fuel estimation software for aircraft gave. That's when there are in the most literal sense lives on the line, and that's a completely different kind of pressure. You simply can not fuck up. It also really rammed home the value of code review and having a good specification so that you can ensure that within the envelope of input parameters your software does what it is supposed to do.
        [-]
        drmpeg 1 hour ago
        Multi-link HDLC. That brings me back to 1985 and the Intel 8274, Zilog Z8530 and the Western Digital WD2511 (that implemented the layer 2 protocol in silicon).
        [-]
        jacquesm 1 hour ago
        Yes, that was around that time. There were a couple of problems: the first batch of chips we got came very early in the chip's development, with pre-production issues, and there were some bugs in the chips themselves which could cause lock-up under some circumstances. We found ways to work around those and then of course there were all of the niceties around dealing with a device that generates an extremely high rate of interrupts. So the code would either have to attempt to service all channels on any interrupt or be re-entrant. I don't remember which solution we picked, but in the end the thing was, once we had the bugs worked out, fairly bulletproof.
        One funny bit about the development process was that initially I was going to be nice and implement every layer as its own stand-alone bit of software communicating via a defined set of primitives with the layers above and below. But that was slow as molasses so in the end all of that elegance got discarded for an absolutely unholy sandwich that did all of the layers in a single chunk of code. But with the experience from the 'slow' version that was actually doable, I would have never been able to write that as the first implementation. Classic illustration of 'first make it work, then make it fast'.
        [-]
        drmpeg 35 minutes ago
        Cool. We had a small company in San Diego called Metacomp (long gone now) design a Z8530 add-on module for their 80186-based Multibus board. It had two Z8530 chips on it and you could have two modules on the base board. So 8 channels total per slot.
        [-]
        jacquesm 20 minutes ago
        Yes, 8 was just about the maximum, simply because that would approach the limits of how many interrupts you could process before you'd inevitably start losing them. Also, space and power delivery limitations, and connector space on the backside made doing more than that very impractical. Or you had to take more than one interrupt per board but that was 'not done'. I've always loved working on that division line between hardware and software, as close to the metal as possible. Funny, I'd all but forgotten about this project, now I'm wondering if I still have the software somewhere, just searched for a bit and can't locate it, so there is a good chance I wrote it on company hardware. I did find another project that I'd forgotten about, a CAD program for sails for sailboats that I wrote in the 80's for TD Sails in the Netherlands. That was a great project to work on and I met some awesome people doing it.
    - hsbauauvhabzb 3 hours ago
      No, the way to think is ‘it’s down less than our closest competitor’s system’
      [-]
  - toast0 9 hours ago
    On the other hand, no nights or weekends.
    [-]
    - noitpmeder 9 hours ago
      This is becoming less and less true every day. 24/7 markets are coming whether you want them or not, and if you're not trading outside of US Equity hours you're leaving significant money on the table.
      [-]
      - Galanwe 7 hours ago
        > This is becoming less and less true every day. 24/7 markets are coming whether you want them or not
        This has yet to exist for real. Every couple of years there is a resurgence of "we should have 24/7 equity markets", followed by milestone announcements from Nasdaq.
        The reality is more... nuanced.
        First, there has to be a wide enough window for corporate actions and news dissemination to happen. Not only is that a regulatory requirement, but anyway no sane investor would like to trade in such a window even if possible.
        Second, there are already early sessions, pre-opening sessions, and late sessions on most American markets. The liquidity there is almost nonexistent, so I wouldn't say there is a huge demand for extended trading hours by non-retail participants.
        And last, half the liquidity of US markets is not on lit books, and half the remaining lit liquidity is close to the auction. I don't think it would even be a net profit for lit exchanges to extend their hours in these conditions. As for dark pools, I think only 2 or 3 of the smaller ones are 24/7, like the one from IB.
        > if you're not trading outside of US Equity hours you're leaving significant money on the table
        HFTs need a lot of liquidity and tight spreads, that's pretty much only doable at scale in the US and Japan. Europe and EM are either not liquid enough, or only really able to sustain low capital ad-hoc strategies.
      - dfex 8 hours ago
        Genuinely curious - would 24x7 low-latency trading matter then? Wouldn't after-hours trading (for your local timezone) be happening on an exchange in another location?
        Or do you mean dark pools/private exchanges that may run 24x7?
    - hamonrye 9 hours ago
      PoE is the way to use low voltage in electronics.
  - ta1243 6 hours ago
    I'd want to know why they had such a system as a single point of failure
    [-]
    - hshdhdhehd 3 hours ago
      Failure modes can be a bug making a bad trade while doing so at 100% uptime.
    - kakacik 4 hours ago
      No business is designed ideally. Now do you want the job, or do we move on to someone else who will accept this?
userbinator 8 hours ago
> less than 60ns loopback latency
There are some Ethernet switches with 4ns latency, and those do more than just sending and receiving, so there's clearly still an order of magnitude of improvement available. 4ns is basically ~40 cycles of the bit clock for 10G Ethernet.
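As a rough check of that arithmetic, here is a minimal Python sketch. It assumes a nominal 10 Gb/s data rate (10 bits per ns) and, for comparison, the 10.3125 Gbaud serial line rate that 10GBASE-R uses after 64b/66b encoding; the constant and function names are illustrative, not from the article.

```python
# Bit-time arithmetic for 10G Ethernet (illustrative names, nominal rates).
BITS_PER_NS_DATA = 10        # nominal 10 Gb/s data rate
BITS_PER_NS_LINE = 10.3125   # 10GBASE-R serial line rate after 64b/66b


def bits_in_window(window_ns, bits_per_ns):
    """Return how many bit times fit into a window of window_ns nanoseconds."""
    return window_ns * bits_per_ns


if __name__ == "__main__":
    # The 4 ns figure quoted for the fastest devices.
    print(bits_in_window(4, BITS_PER_NS_DATA))   # 40
    print(bits_in_window(4, BITS_PER_NS_LINE))   # 41.25
    # The 60 ns loopback figure from the article.
    print(bits_in_window(60, BITS_PER_NS_DATA))  # 600
```

So the "~40 cycles" estimate holds at either rate, and the 60ns loopback figure corresponds to roughly 600 bit times.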
[-]
- jacquesm 2 hours ago
  It's a bit more complicated than that. Loopback implies that the data went off the wire and was sent back. That puts some serious constraints on the process that deals with that data, how it is stored and how it is fed back into the system. You can't just compare that with naked pass-through latency (which is what a switch would do), because the signal is already on the way out before it has even fully been received.
  Essentially a few ns after the header is received it will already be passed on to the output port after some signal conditioning, which happens almost without latency. A typical 'high speed' switch will have 800 ns round trip, so that 60 ns quoted here is actually quite impressive.
- mgrosvenor 1 hour ago
  No they don’t. At least not on the critical path. L1 switches at 4ns simply replicate the incoming electrical signal. That’s all they do.
  They also do some packet accounting and cute features off the critical path. None of that at 4ns.
  [-]
  - foobar10000 27 minutes ago
    In fairness, assuming 4/8 ports in, 4 ports out, operating at some ungodly GHz using a custom GaAs or SiGe chip, and working in gearboxed scrambled space with very clever input mac prefix mapping and output scrambled/gearboxed precomputation, one _could_ do around 8ns optimistically from the start of the first 64b/66b block (the 0x55... preamble) for 10GB. There's some stuff about preamble shrink that makes it wonky, but that is doable. And interestingly enough, for 25G-R, one can comfortably do this in about 4ns. I am not in fact aware of such a beastie existing for 25G, but I have seen 1 or 2 for 10G though.
    Surprisingly, if ChatGPT is prompted _juust_ right, it will even give you a good way to do this.
  - rrr_eady 1 hour ago
    This guy knows a thing or two about these devices ;)
    [-]
    - foobar10000 55 minutes ago
      Yes he very much does :)
- darrin 2 hours ago
  There are L1 crosspoint "switches" in the 2-5ns range depending on port density and similar modes in some Ethernet silicon. These are not Ethernet switches in any normal sense though. They only replicate the signal 1:1 or 1:n and do not dynamically switch the destination based on anything in the packet. The fastest L3 Ethernet switches on the market are ~90ns.
- mrlongroots 6 hours ago
  The other funny bit is that one-way PCIe latency is 250ns-ish (don't quote me on the exact numbers), which imposes a hard 1us constraint on latency between two hosts.
  [-]
  - mgrosvenor 1 hour ago
    You can go quicker with CXL, but not by much.
    [-]
    - foobar10000 1 hour ago
      There are always DPUs like NextSilicon, NVidia's Blue*, etc - at under 100 ns of SoF to fast compute though.
- rrr_eady 4 hours ago
  For switching, this is just not true. There's in/out solutions that are at this kind of scale, but they are by definition not implementing any switching logic.
- rasz 6 hours ago
  Cut-through switching needs what, 14 bytes? That's ~12ns on 10G? How would you do it in 40 cycles?
  [-]
  - Perz1val 6 hours ago
    Not my area of expertise, but afaik the neat part is that you don't do it in cycles. The "program" lives as the transistor layout, not as instructions
    [-]
    - wtallis 6 hours ago
      The cycles in question are not those of any processor, but those of the signal coming in the wire. No matter how it's implemented, a switch cannot react to bits that are still in the wire and haven't made it to the switch yet.
  - ongy 4 hours ago
    why do you assume the full ethernet header?
    IIRC, cut-through only needs the first 6 bytes, since it only needs the destination address for the port lookup.
    Potentially the first bit, on broadcast.
    [-]
    - rasz 3 hours ago
      Because you might want VLAN. Plus you can't just start blasting a reply without a preamble, so it's still 14 bytes after receiving just the dest MAC. Then you get potential IFG, FEC, scrambling, it all adds up, no way any switch can do 4ns without heavy lawyer talk in the small print.
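The byte counts argued over in this sub-thread can be turned into wire times with a short Python sketch. It assumes a nominal 10 Gb/s data rate and standard Ethernet field sizes (6-byte destination MAC, 14-byte header, 4-byte 802.1Q tag, 8-byte preamble plus SFD); it counts serialization time only and ignores PHY, encoding and lookup delays. The names are illustrative.

```python
# Serialization time of Ethernet header fields at a nominal 10 Gb/s.
BITS_PER_NS = 10  # 10 Gb/s expressed as bits per nanosecond

DEST_MAC_BYTES = 6       # destination address only
HEADER_BYTES = 14        # dest MAC + source MAC + EtherType
VLAN_TAG_BYTES = 4       # optional 802.1Q tag
PREAMBLE_SFD_BYTES = 8   # preamble plus start-of-frame delimiter


def serialization_ns(num_bytes):
    """Time in ns for num_bytes to arrive on the wire at BITS_PER_NS."""
    return num_bytes * 8 / BITS_PER_NS


if __name__ == "__main__":
    print(serialization_ns(DEST_MAC_BYTES))                 # 4.8
    print(serialization_ns(HEADER_BYTES))                   # 11.2
    print(serialization_ns(HEADER_BYTES + VLAN_TAG_BYTES))  # 14.4
    print(serialization_ns(PREAMBLE_SFD_BYTES))             # 6.4
```

Under these assumptions, even the destination MAC alone takes 4.8ns to arrive, which is already more than the 4ns being discussed; the full header takes 11.2ns, matching the "~12ns" estimate.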
throwaway2037 7 hours ago
LinkedIn tells me: https://uk.linkedin.com/in/ttchisholm
```
    FPGA engineer with a focus on ultra-low latency networking at Jane Street.
```
Yikes.
[-]
- anthonj 7 hours ago
  Not sure I get the implication, something wrong with the company?
  [-]
  - exogenousdata 2 hours ago
    I’m not the parent, but I’d assume that this was an impressed ‘yikes.’ While Jane is not the most well known firm for ultra low latency trading, an FPGA engineer from that firm could be considered an expert practitioner in the field.
  - lloydatkinson 5 hours ago
    Some people just love dropping “yikes” type comments and then never explaining themselves.
- dev_hugepages 5 hours ago
  Is this really unexpected? A person who knows ultra-low-latency networking with FPGAs writes an article about making low-latency networking hardware for a company that needs high-speed, low-latency communications (high-frequency trading)