motd

Welcome to
           __
      (___()'`;  www.nerdondon.com
      /,    /`
      \\"--\\

^ it's a huge gou. Get it? Think Chinese :).
..."gou" means dog in Chinese, but also Hugo for the blog platform.

Anyway, the content will generally be technical and about software engineering
but I might throw in some random topics here and there.

Linux dne.nerdondon.com 2.0.0-1-arm64
Last login: (lol how do you do this with hugo?)
      

[Notes]: What’s inside of TurboFan? by Benedikt Meurer

Prelude

Some notes from the What’s inside of TurboFan? talk by Benedikt Meurer at AgentConf 2018. Watch on YouTube here.

Actual Notes :D

Intro to TurboFan

  • TurboFan is the optimizing compiler in the execution pipeline for the v8 JS engine.

  • Replaces Crankshaft–the old optimizing compiler–which couldn’t handle optimizations for new JS (ES2015+) features.

    • One interesting case of this is that Crankshaft couldn’t handle optimizing try-catch, and ES2015 has a lot of try-catch usage. For example, for..of loops have an implicit try..catch.
  • TurboFan begins compilation from bytecode whereas Crankshaft compiles from JS source. This is an important difference because Ignition (the v8 interpreter) generates bytecode as part of its interpretation process. This means that TurboFan has a head start and doesn’t have to re-parse the JS source as part of its compilation process.

  • TurboFan takes the bytecode and generates a graph representation of it, which is then fed into the inlining and specialization phase. This phase will use runtime information from the interpreter to prune graph paths that are not used, thereby optimizing the code. This is the frontend of the compiler.

  • The result is passed through an optimization phase where textbook optimizations (e.g. redundancy elimination, escape analysis) are made to the graph representation of the code.

  • The last phase (or backend) of the compiler performs machine-level optimizations and code generation. This phase is specific to the machine the code is running on (a little redundant but worth repeating :D).

Inlining Specifics

  • Inlining essentially means taking the body of a function and placing it at the call site of said function.

  • Might not look like much on its own but, combined with other optimization passes (e.g. constant folding), it can have amazing results (see the sketch after this list).

  • The compiler can also inline builtin JS functions (e.g. Array.prototype.every) and run further optimization passes on the library functions.

  • Builtin inlining is important because it helps developers write idiomatic JS.
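
To make the inlining plus constant folding combination concrete, here is a minimal hand-waved sketch in plain JS (the function names are made up for illustration and this is not literal TurboFan output):

  // What the developer writes:
  function square(x) {
    return x * x;
  }
  function area(r) {
    return Math.PI * square(r);
  }

  // Conceptually, after inlining `square` into `area`:
  function areaInlined(r) {
    return Math.PI * (r * r);
  }

  // If a call site is `area(2)`, constant folding can then reduce the whole
  // expression at compile time:
  //   area(2)  ->  Math.PI * (2 * 2)  ->  Math.PI * 4  ->  12.566...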

Predictable Performance

  • Crankshaft had really crazy performance cliffs. This means that special cases that differed from an optimized code path would incur large and unpredictable penalties.

  • In Crankshaft, certain features (e.g. for..of, const usage, etc.) caused a function to be completely disabled for optimizations. TurboFan removes this obstacle and allows functions containing these so-called “optimization killers” to be optimized. It does not mean these features can be fully optimized, but at least the rest of the function is checked for optimization.

  • A caveat to keep in mind about optimizing JS code is that, in real world scenarios, there may be significantly more time spent doing layout and style calculations. Fully focusing on getting the most optimal JS compilation may not be the best way to spend time.

  • A key takeaway, though, is that when optimizations are made through TurboFan there are significant performance increases.

Useful references

Read more →

[Notes]: Distributed Sagas: A Protocol for Coordinating Microservices

Prelude

Some notes from the Distributed Sagas: A Protocol for Coordinating Microservices talk given by Caitie McCaffrey at .NET Fringe 2017. Watch on YouTube here.

Actual Notes :D

Pretext

  • With the evolution of service architectures toward a microservices-based architecture comes the need to maintain database integrity. One class of solutions for this is termed feral concurrency control by Peter Bailis in a 2015 paper.

    • Side note: feral is defined as wild and uncultivated.
  • Feral concurrency control refers to “application-level mechanisms for maintaining database integrity” (emphasis my own). This means that the database is no longer the source of truth that enforces consistency and concurrency guarantees. This code is now spread throughout the services that constitute the application.

  • This is an issue for code maintenance and puts a burden on application developers; this concern should really be abstracted away from individual services.

  • The example given in the talk is a trip-planning application: users can book hotels, rental cars, and flights. There is also a payment system.

  • Example flow with feral concurrency control: the front-end service sends a request to the hotel service, which has code that determines that the next step is the payments service. The hotel service also determines what happens when a payment fails.

  • The key here is that causal relationships–the chain of events that need to occur–are written into the services.

  • You can imagine a trips service being added that utilizes all of the other services–the hotel, car rental, flight, and payments services. There would be a ton of feral concurrency mechanisms built into the service in order to deal with failures in its dependencies.

  • As these systems evolve you can end up with a mesh of service relationships, i.e. the death star architecture. These become hard to maintain because you don’t know what causal relationships are encoded into which services. Therefore, you don’t have observability into the architecture.

Solved problem?

  • Frowns on Google’s Spanner project because it promises both consistency and availability in the CAP theorem, but this is only true because they redefine availability as five nines of availability.

  • Strong consistency between all services may not be the boon it is made out to be because–as hypothesized by a Facebook paper–service latencies may increase. This latency increase is posited to be due to all services being bound to the slowest machine in a cluster.

  • So the strongest form of consistency may not be desirable due to the latency it adds back into the system, and users may be more bothered by such latency increases than by minor consistency issues.

  • Two phase commit (2PC) is another way to keep atomicity across distributed transactions.

  • A prepare phase where services reply to a coordinator indicating whether they can service the request, then a commit phase where the coordinator tells services to commit or abort. Finally, services reply with a done message (see the sketch after this list).

  • 2PC isn’t used in industry because it doesn’t scale well. There may be O(N^2) messages in the worst case, and the coordinator presents a single point of failure.
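
As a rough sketch of the 2PC message flow described above (the participant objects and their prepare/commit/abort calls are assumptions for illustration, not any particular framework’s API):

  // Hypothetical 2PC coordinator sketch.
  async function twoPhaseCommit(participants, txn) {
    // Phase 1: prepare. Every participant must vote yes.
    const votes = await Promise.all(
      participants.map((p) => p.prepare(txn).catch(() => "no"))
    );
    const decision = votes.every((v) => v === "yes") ? "commit" : "abort";

    // Phase 2: the coordinator tells everyone to commit or abort and waits
    // for the done acknowledgements. Note the coordinator itself remains a
    // single point of failure.
    await Promise.all(
      participants.map((p) => (decision === "commit" ? p.commit(txn) : p.abort(txn)))
    );
    return decision;
  }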

Distributed sagas

  • Sagas were originally proposed in a 1987 paper for use in a single database for long-lived transactions.

  • “A Saga is a Long Lived Transaction that can be written as a sequence of transactions that can be interleaved. All transactions in the sequence complete successfully or compensating transactions are run to amend a partial execution.”

  • A distributed saga is defined in the talk to be a “collection of requests and compensating requests that represent a single business level action”

    • The request can be booking a flight and the compensating request can be canceling that flight
  • Invariants that we want in our requests:

    1. Can abort
    2. Idempotent
  • Invariants that we want for our compensating requests:

    1. Semantically undoes the effect of a request. Doesn’t need to roll back to the exact same state, e.g. charge a credit card and then issue a refund. You may see two line items on a statement but the state is semantically the same.
    2. Cannot abort i.e. must always succeed
    3. Idempotent so we can re-run until success
    4. Commutative because timeouts may cause retries of the same request that we are trying to compensate for
  • Distributed saga guarantee: “All requests were completed successfully or a subset of requests and the corresponding compensating requests were executed”

  • No guarantee of either atomicity or isolation of requests

  • Can model as a DAG and inherits nice properties of DAGs like concurrent execution of certain constituent sagas

  • Need a distributed saga log in order to maintain fault tolerance and high availability

  • Also need a saga execution coordinator (SEC) to execute the saga DAG. The SEC differs from the coordinator in 2PC in that there can be multiple SECs.

  • To do rollback, flip the edges of the DAG. This keeps causal relationships intact, e.g. if there is a payment for a flight then there was a flight booked.

  • The SEC is a key component because all concurrency control is stored within it and not spread throughout the codebase, i.e. not feral (a minimal sketch of an SEC follows this list).

  • Makes service composition easy. You can imagine having a saga for car rentals like book rental car -> pay for rental and analogous sagas for booking a hotel and booking a flight. A big trips saga can then just pull the independent sagas into a DAG of its own.
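
A minimal sketch of what an SEC loop could look like (the service calls are hypothetical placeholders, the steps run sequentially for simplicity, and a real SEC would also persist every step to the distributed saga log so another SEC can resume after a failure):

  async function runSaga(steps) {
    const completed = [];
    try {
      for (const step of steps) {
        await step.request();   // forward request: abortable, idempotent
        completed.push(step);
      }
    } catch (err) {
      // Roll back by running compensating requests in reverse order.
      // Compensations cannot abort, so retry until they succeed; they are
      // idempotent and commutative, which makes retries safe.
      for (const step of completed.reverse()) {
        for (;;) {
          try {
            await step.compensate();
            break;
          } catch (_) {
            await new Promise((resolve) => setTimeout(resolve, 100)); // back off and retry
          }
        }
      }
    }
  }

  // Usage with made-up service calls for the trip example:
  runSaga([
    { request: bookHotel,  compensate: cancelHotel },
    { request: bookFlight, compensate: cancelFlight },
    { request: chargeCard, compensate: issueRefund },
  ]);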

Useful references

Read more →

[Notes]: The JavaScript Event Loop

Prelude

This is gonna be a long one lol…There’s a lot to know about the event loop.

Actual Notes :D

  • setTimeout(callback, ms) doesn’t work how you would intuitively expect it to. The JavaScript main thread doesn’t just yield back to the setTimeout callback exactly after the specified ms. It must wait for a “turn” of the event loop in addition to the specified timeout. So really, you should think of the timeout you specify as the minimum time that setTimeout will sleep for.

  • After the timeout, the callback you specify is queued as a task to be executed by the main thread. It is common to hear that JavaScript is single-threaded, but this is not entirely accurate. Asynchronous tasks are still run in parallel on separate threads so that the main thread is not blocked from doing other things like rendering. The single-thread notion just comes from the concurrency model utilizing a main thread as a means to process results from the async threads.

  • The JS specification allows for browsers to implement multiple queues for different things. This allows browser implementers to have some leeway in how they implement the spec. In general, you can conceptualize that there will be a (1) main task queue, (2) a micro-task queue, and (3) an animation callback queue. Microtasks and animation callbacks are described below.

  • Take the following code block as an example:

    setTimeout(cb1, 1000);
    setTimeout(cb2, 1000);
    
    • The event loop will process the callbacks like so:

      1. Execute the statements synchronously in one “turn” of the event loop
      2. Callbacks are queued in the task queue after 1000 ms
      3. cb1 is executed in the current turn
      4. cb2 is executed in the subsequent turn
    • The key takeaway is that tasks are executed one at a time, i.e. one task per turn of the event loop

requestAnimationFrame

  • So the previous is a summary of the task queue portion of the event loop…a separate part is the rendering logic

  • When you want to run code with the render steps, you would use a requestAnimationFrame callback. Conceptually, the render steps are on the complete other side of the task queues. Visually (see the screenshot below from Jake Archibald’s talk):

    Event loop screen capture from Jake Archibald’s talk The Event Loop.
  • So far what we have is, on a turn of the event loop where a render is necessary:

    1. rAF callbacks are executed right at the start of the frame
    2. Layout and paint is performed
    3. A task from the task queue will be processed
  • “where a render is necessary” is an important note because the browser doesn’t need to render on every turn of the event loop. Many turns can go by before a render is deemed necessary.

    • Renders are only necessary when there are actual style and layout changes. Rendering only needs to happen when the monitor refreshes (often 60 times per second). There won’t be any rendering if the browser tab is not visible.
  • As of this writing (3-15-2018), Safari and Edge(?) put requestAnimationFrame after style and layout occur, which violates the spec.

Compositor thread

  • Though not in spec, most browsers will offload the actual rendering of pixels to what is known as the compositor thread. Pixel data is computed during the paint step and then it is sent to the compositor thread as a texture to actually be written onto the screen.

  • “The compositor can only take already drawn bitmaps and move them around” - Jake Archibald

  • This also means that certain CSS properties (e.g. transform) can be entirely offloaded to the compositor thread and not be affected by what happens on the event loop. (Stay tuned–notes on this coming right up; a small sketch follows below.)
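
For example, here is a rough sketch (assuming a hypothetical element with id "spinner") of a transform animation that browsers can typically promote to the compositor thread:

  // Transform/opacity animations like this can usually keep running smoothly
  // on the compositor thread even while a long task blocks the main thread.
  const spinner = document.getElementById("spinner");
  spinner.animate(
    [{ transform: "rotate(0deg)" }, { transform: "rotate(360deg)" }],
    { duration: 1000, iterations: Infinity }
  );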

Microtasks

  • Used for promise callbacks, but were originally designed to be used for MutationObservers.

  • Microtasks happen at the end of a task or when the JavaScript stack goes from non-empty to empty. This means that microtasks can run during a task or during rAFs.

  • Because microtasks are run after the stack empties, you are guaranteed that your promise callback is not running while other JavaScript is running.

  • Taking a look at how the task queues empty:

    • The task queue is processed one task per turn of the event loop, and new tasks queued by the current task go to the end of the queue to wait for a later turn of the event loop.

    • During render, the animation task queue will be emptied except for any new callbacks that get scheduled.

    • When the JS stack empties, all microtasks in the queue will be processed as well as any additional microtasks that are added by pre-existing microtasks. This means that you can have an infinite loop of microtasks.
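
A tiny sketch of the ordering this implies (assuming a browser or Node-like environment):

  setTimeout(() => console.log("task: setTimeout"), 0);
  Promise.resolve().then(() => console.log("microtask: promise"));
  console.log("sync");

  // Logs "sync" first (the current task), then "microtask: promise" (microtasks
  // run as soon as the stack empties), and only on a later turn of the event
  // loop does "task: setTimeout" run.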

Useful references

Read more →

[Notes]: rIC (requestIdleCallback) rolled

Prelude

…I got nothing.

Actual Notes :D

  • requestIdleCallback allows you to schedule actions during any remaining time at the end of a frame or when a user is inactive.

  • This is different from requestAnimationFrame in that rAF occurs before painting.

  • The sequence of events is as follows:

    1. execute JS from task queue
    2. rAF
    3. style calculations, layout, and paint (collectively called a frame commit)
    4. requestIdleCallback (or anywhere between the first 3 steps if the user is idle)
  • You cannot tell how much time the first 3 steps will take up, so there is no guarantee that there will be any remaining time to allocate to your requestIdleCallback’s. This means that only non-essential work should be executed using requestIdleCallback.

  • To ensure that this non-essential work gets done eventually, you can set a timeout when calling requestIdleCallback. When this timeout is reached, the browser WILL execute the callback passed to requestIdleCallback (see the sketch after this list).

  • It is important not to apply any DOM changes when using requestIdleCallback because the callback may be executed at the end of a frame, i.e. after the frame has been committed. This means that if there are any layout reads in the next frame (e.g. getBoundingClientRect), the layout needs to be recalculated via a forced synchronous layout. When you need to make a DOM update, schedule the update via a requestAnimationFrame callback. This also ensures that the time needed for the DOM changes doesn’t cross the browser deadline prescribed to the requestIdleCallback.

  • What is a forced synchronous layout? This is an extra layout operation that is caused when there are DOM changes from JavaScript and those changes are subsequently read. The extra layout operation is necessary because the DOM changes from JS invalidate the layout calculation from the previous frame.
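
A minimal sketch of scheduling non-essential work with requestIdleCallback, using the deadline and a timeout (the work items here are made-up placeholders):

  const pendingTasks = [logAnalyticsEvent, warmCache]; // hypothetical non-essential work

  function processIdleTasks(deadline) {
    // Keep working while the browser reports idle time left in this frame,
    // or if our timeout forced the callback to run.
    while ((deadline.timeRemaining() > 0 || deadline.didTimeout) && pendingTasks.length) {
      pendingTasks.shift()();
    }
    if (pendingTasks.length) {
      requestIdleCallback(processIdleTasks, { timeout: 2000 });
    }
  }

  requestIdleCallback(processIdleTasks, { timeout: 2000 });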

Useful references

Read more →

[Notes]: Riffing on rAF (requestAnimationFrame)

Prelude

Sooo…three years later and I’m finally getting around to posting something. I’ve always kept a strong pipeline of articles to read and conference talks to watch, but I eventually end up forgetting all the cool things I learn. I didn’t really take any notes or, if I did, they were always in my trusty moleskine which wasn’t really too searchable ): . Thus came the idea to use the blog as my notebook :D, so here we are…

DISCLAIMER: These notes are meant to be short and to serve as jumping-off points for deeper refreshers when necessary

Actual notes :D

  • requestAnimationFrame allows you to schedule actions (e.g. animation) to occur with the next layout/paint cycle of the browser event loop

  • Animations scheduled using a timer (i.e. setTimeout) should instead be switched to requestAnimationFrame (see the sketch at the end of this list)

  • requestAnimationFrame allows you to sync your animations with the refresh rate of the client’s monitor, thereby eliminating extra work that may be done or missed when using setTimeout

  • setTimeout may be doing mistimed work because its callback is executed as part of the browser event loop’s task queue. The task queue gets processed after a frame paint and after the JavaScript execution stack is empty. Since you can’t tell how long it will take for the stack to empty, you cannot be certain that your work scheduled with setTimeout will be executed within the current frame. Put another way, you cannot be certain that your animation will execute before the next paint. This skip in animation work can cause jank for your users.

  • If the interval you set for setTimeout is less than the refresh rate of a user’s monitor, you may be doing more work than a user’s monitor can actually display to them

  • requestAnimationFrame is also battery friendly because it doesn’t run unless its browser window/tab is active.

  • A use case for requestAnimationFrame is scroll/resize-related animations, e.g. parallax effects. rAF helps avoid jank if you are doing expensive DOM mutations in your scroll event handlers
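
A minimal sketch of driving an animation from requestAnimationFrame instead of setTimeout (the element id and the one-second duration are made up for illustration):

  const box = document.getElementById("box");
  const start = performance.now();

  function step(now) {
    const progress = Math.min((now - start) / 1000, 1); // 0 -> 1 over one second
    box.style.transform = `translateX(${progress * 200}px)`;
    if (progress < 1) {
      requestAnimationFrame(step); // re-schedule work for the next frame
    }
  }

  requestAnimationFrame(step);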

Useful references

Read more →

Hi :D

Welcome to Will’s blog :D. Hopefully, I’ll post regularly. Like the bio says, I’ll post primarily about software development and maybe some quant and trading algo stuff.

Read more →