
News

Posted over 3 years ago by TWiR Contributors
Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions. This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

No Rust Blog posts this week.

Newsletters
- RiB Newsletter #17 - Trick? Or Trait?
- This month in Dimforge #2 (October 2020)
- These Weeks in Actix | Sep-Oct '20

Tooling
- Rust-Analyzer Changelog #49
- IntelliJ Rust Changelog #134
- IntelliJ Rust: New Functionality for Cargo Features

Observations/Thoughts
- Semantic FFI Bindings in Rust - Reactivating the Borrow Checker
- Exception safety in Rust: using transient droppers to prevent memory leaks
- Wasmcloud Progress
- Fast programming languages: C, C++, Rust, and Assembly
- For Complex Applications, Rust is as Productive as Kotlin
- Rust for Data-Intensive Computation
- Using Rust for a simple hardware project
- The Fatal Flaw of Ownership Semantics
- Fixing bootstrap of rustc using cg_clif
- Advanced Cargo [features] Usage

Rust Walkthroughs
- Rust Design-for-Testability: a survey
- Rust from a Gopher - Lessons 3 & 4
- Rocket Tutorial 01: Basics
- Building an AWS Lambda extension with Rust
- A Gopher Client in Rust - 02 Core Client
- A Gopher Client in Rust - 03 Bookmarks and Full Code
- Rust HTTP Testing with httpmock
- The Newtype Pattern in Rust
- How to: Rust + SDL2 + OpenGL on the web
- Minicompiler: Lexing
- Continuous Deployment For Rust Applications (Zero To Production In Rust #5)
- [DE] The Rust Programming Language (translated in German)
- [video] (Live Coding) Audio adventures in Rust: UI with WASM, Yew, and WebView
- [video] How to build a multiplayer game - RustFest.Global Pre-Event
- [video] Current state of wasm with rust using an example
- [video] Understanding Rust Lifetimes

Project Updates
- oso, an open-source policy engine for authorization written in Rust, released version 0.7.1 of their authorization library for Rust projects!
- Apache Arrow 2.0.0 Rust Highlights

Miscellaneous
- One Hundred Rust Binaries
- Why Dark didn't choose Rust
- Rust GameDev Ecosystem Survey

Crate of the Week

This week's crate is tract from Sonos, a neural network inference library, written purely in Rust for models in ONNX, NNEF and TF formats. Thanks to Benjamin Minixhofer for the suggestion! Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started! Some of these tasks may also have mentors available; visit the task page for more information. If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

374 pull requests were merged in the last week:
- add cg_clif as optional codegen backend (Woohoo!)
- rustc_span: improve bounds checks in byte_pos_to_line_and_col
- adjust turbofish help message for const generics
- avoid complex diagnostics in snippets which contain newlines
- suggest calling await on method call and field access
- fix control flow check for breaking with diverging values
- uplift temporary-cstring-as-ptr lint from clippy into rustc
- check object safety of generic constants
- chalk: make max goal size for recursive solver configurable
- coherence check perf: iterate over the smaller list
- optimise align_offset for stride=1 further
- inline NonZeroN::from(n)
- inline Default::default() for atomics
- inline some functions in core::str
- prevent String::retain from creating non-utf8 strings when abusing panic
- add fetch_update methods to AtomicBool and AtomicPtr
- add [T]::as_chunks(_mut)
- fix Box::into_unique
- hashbrown: better branch likelyness on stable
- futures: add WeakShared
- cargo: add a future-compatibility warning on allowed feature name characters
- cargo: new namespaced features implementation

Rust Compiler Performance Triage

2020-11-03: 0 Regressions, 5 Improvements, 0 mixed. A number of improvements on various benchmarks. The most notable news this week in compiler performance is the progress on instruction metric collection on a per-query level; see measureme#143 for the latest. Otherwise, this week was an excellent one for performance (though mostly on stress tests and auto-generated test cases rather than commonly seen code). See the full report for more.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:
- No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
- RFC: Target extension

Tracking Issues & PRs
- [disposition: merge] consider assignments of union field of ManuallyDrop type safe
- [disposition: merge] repr(transparent) on generic type skips "exactly one non-zero-sized field" check
- [disposition: merge] Rename/Deprecate LayoutErr in favor of LayoutError
- [disposition: merge] Tracking Issue for raw_ref_macros
- [disposition: merge] Add checking for no_mangle to unsafe_code lint

New RFCs
- Checking conditional compilation at compile time

Upcoming Events

Online
- November 7 & 8, Global, RustFest Global
- November 10, Seattle, WA, US - Seattle Rust Meetup
- November 10, Saarbrücken, Saarland, DE - Meetup: 5u16 (virtual) - Rust Saar
- November 12, Berlin, DE - Rust Hack and Learn - Berline.rs
- November 12, Washington, DC, US - Mid-month Rustful: How oso built a runtime reflection system for Rust - Rust DC
- November 12, Lehi, UT, US - WASM, Rust, and the State of Async/Await - Utah Rust

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs
- Software Engineer (IoT/Robotics) at Wayve (London, UK)
- Software Engineer at ChainSafe Systems (Toronto, Remote)
- Senior Software Engineer - Rust at Immunant (Remote US)
- Backend Engineer - Rust at Kraken (Remote NA, SA, EMEA)
- Backend Engineer, Kraken Futures - Rust at Kraken (Remote)
- Rust Engineer, Desktop GUI - Cryptowatch at Kraken (Remote)
- Senior Backend Engineer - Rust at Kraken (Remote NA, SA, EMEA)
- Senior Full Stack Engineer - Rust at Kraken (Remote)
- Software Engineer - Trading Technology (Rust) at Kraken (Remote NA, SA, EMEA)

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

"Like other languages, Rust does have footguns. The difference is that we keep ours locked up in the unsafe." – Ted Mielczarek on Twitter

Thanks to Nikolai Vazquez for the suggestion. Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust
Posted over 3 years ago by Gregory Mierzwinski
After over twenty years, Mozilla is still going strong. But over that amount of time, there are bound to be changes in responsibilities, and that brings unique challenges for test maintenance: when a test's original creators leave, knowledge of its purpose and inner workings can disappear with them. This is especially true when it comes to performance testing.

Our first performance testing framework is Talos, which was built in 2007. It's a fantastic tool that is still used today for performance testing very specific aspects of Firefox. We currently have 45 different performance tests in Talos, and all of those together produce as many as 462 metrics. Having said that, maintaining the tests themselves is a challenge because some of the people who originally built them are no longer around. In practice, the last person who touched the code, and who is still around, usually becomes the maintainer of a test. With a lack of documentation on the tests themselves, this becomes a difficult task when you consider the possibility of a modification changing what is being measured and drifting away from the test's original purpose.

Over time, we've built another performance testing framework called Raptor, which is primarily used for page load testing (e.g. measuring first paint and first contentful paint). This framework is much simpler to maintain and to keep aligned with its purpose, but the settings used for the tests change often enough that it becomes easy to forget how we set up a test, or what pages are being tested exactly. We have a couple of other frameworks too, the newest one (still in development) being MozPerftest; there might be a blog post on this in the future.

With this many frameworks and tests, it's easy to see how test maintenance over the long term can turn into a bit of a mess when it's left unchecked. To overcome this issue, we decided to implement a tool to dynamically document all of our existing performance tests in a single interface, while also preventing new tests from being added without proper documentation or, at the least, an acknowledgement that the test exists. We called this tool PerfDocs.

Currently, we use PerfDocs to document tests in Raptor and MozPerftest (with Talos in the plans for the future). At the moment in Raptor, we only document the tests that we have, along with the pages that are being tested. Given that Raptor is a simple framework whose main purpose is to measure page loads, this documentation gives us enough without getting overly complex. However, we do plan to add much more information to it in the future (e.g. what branches the tests run on, what the test settings are).

The PerfDocs integration with MozPerftest is far more interesting, though, and you can find it here. In MozPerftest, all tests are required to carry metadata in the test itself. For example, here's a test we have for measuring start-up time on our mobile browsers, which describes things such as the browsers it runs on and even the owner of the test. This forces the test writer to think about maintainability as we move into the future, rather than simply writing a test and forgetting it. For that Android VIEW test, you can find the generated documentation here. If you look through the documented tests that we have, you'll notice that we also don't list a single person as a maintainer. Instead, we refer to the team that built the test as the maintainer.
Furthermore, the tests actually live in the folders (or code) that those teams are responsible for, so they don't need to exist in the framework's folder, which gives us more accountability for their maintenance. By building tests this way, with documentation in mind, we can ensure that as time goes on we won't lose information about what is being tested, its purpose, or who should be responsible for maintaining it.

Lastly, as I alluded to above, beyond generating documentation dynamically we also ensure that any new tests are properly documented before they are added to mozilla-central. This is done for both Raptor and MozPerftest through a tool called review-bot, which runs checks on patches submitted to Phabricator (the code review tool that we use). When a patch is submitted, PerfDocs runs to make sure that (1) all the tests that are documented actually exist, and (2) all the tests that exist are actually documented. This way, we prevent our documentation from becoming outdated with tests that no longer exist, and ensure that every test is always documented in some way. For example, review-bot left a complaint on this patch, which was adding new suites, because the actual tests could not be found.

In the future, we hope to expand this tool and its features from our performance tests to the massive box of functional tests that we have. If you think having 462 metrics to track is a lot, consider the thousands of tests we have for ensuring that Firefox functionality is properly tested.

This project started in Q4 of 2019, with myself [:sparky] and Alexandru Ionescu [:alexandrui] building up the base of this tool. Then, in early H1 2020, Myeongjun Go [:myeongjun], a fantastic volunteer contributor, began hacking on this project and brought us from lightly documenting Raptor tests to having links to the tested pages and even integrating PerfDocs into MozPerftest. If you have any questions, feel free to reach out to us on Riot in #perftest.
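The two checks described above boil down to a set comparison between "tests that exist" and "tests that are documented". Purely as an illustration of that idea (a rough sketch in Rust with made-up test names; the real PerfDocs tool lives in mozilla-central and is not this code):

```rust
use std::collections::BTreeSet;

/// Illustrative cross-check in the spirit of the PerfDocs review step:
/// documented tests must exist, and existing tests must be documented.
fn verify_docs(existing: &BTreeSet<String>, documented: &BTreeSet<String>) -> Result<(), String> {
    // Tests present in the tree but missing from the docs.
    let undocumented: Vec<_> = existing.difference(documented).cloned().collect();
    // Tests described in the docs that no longer exist in the tree.
    let stale: Vec<_> = documented.difference(existing).cloned().collect();

    if undocumented.is_empty() && stale.is_empty() {
        return Ok(());
    }
    Err(format!(
        "undocumented tests: {:?}; documented-but-missing tests: {:?}",
        undocumented, stale
    ))
}

fn main() {
    // Hypothetical test names, just to exercise the check.
    let existing: BTreeSet<_> = ["view-startup", "main-startup"].iter().map(|s| s.to_string()).collect();
    let documented: BTreeSet<_> = ["view-startup", "old-removed-test"].iter().map(|s| s.to_string()).collect();

    match verify_docs(&existing, &documented) {
        Ok(()) => println!("documentation is in sync"),
        Err(e) => eprintln!("PerfDocs-style check failed: {e}"),
    }
}
```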
Posted over 3 years ago by Daniel Stenberg
HTTP Strict Transport Security (HSTS) is a standard HTTP response header that lets a site tell the client that, for a specified period of time into the future, that host must not be accessed with plain HTTP but only over HTTPS. It is documented in RFC 6797 from 2012. The idea is of course to reduce the risk of man-in-the-middle attacks when the server resources might be accessible via both HTTP and HTTPS, perhaps due to legacy or just as an upgrade path. Every access to the HTTP version is then a risk that you get back tampered content.

Browsers preload

These headers have been supported by the popular browsers for years already, and they also have a system set up for preloading a set of sites. Sites that exist in their preload list never get accessed over HTTP, since their HSTS state is already known when the browser is fired up for the first time. The entire .dev top-level domain is even in that preload list, so you can in fact never access a web site on that top-level domain over HTTP with the major browsers.

With the curl tool

Starting in curl 7.74.0, curl has experimental support for HSTS. Experimental means it isn't enabled by default and we discourage use of it in production. (Scheduled to be released in December 2020.) You instruct curl to understand HSTS and to load/save a cache with HSTS information using --hsts. The HSTS cache saved into that file is then updated on exit, and if you do repeated invokes with the same cache file, it will effectively avoid clear-text HTTP accesses for as long as the HSTS headers tell it to. I envision that users will simply use a small HSTS cache file for specific use cases, rather than ever really wanting to have or use a "complete" preload list of domains such as the one the browsers use, as that's a huge list of sites and for most use cases just completely unnecessary to load and handle.

With libcurl

Possibly, this feature is more useful and appreciated by applications that use libcurl for HTTP(S) transfers. With libcurl the application can set a file name to use for loading and saving the cache, but it also gets some added options for more flexibility and power. Here's a quick overview:

- CURLOPT_HSTS – lets you set a file name to read/write the HSTS cache from/to.
- CURLOPT_HSTS_CTRL – enable HSTS functionality for this transfer.
- CURLOPT_HSTSREADFUNCTION – this callback gets called by libcurl when it is about to start a transfer and lets the application preload HSTS entries, as if they had been read over the wire and added to the cache.
- CURLOPT_HSTSWRITEFUNCTION – this callback gets called repeatedly when libcurl flushes its in-memory cache and allows the application to save the cache somewhere, and similar things.

Feedback?

I trust you understand that I'm very, very keen on getting feedback on how this works, on the API and on your use cases. Both negative and positive. Whatever your thoughts are, really!
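Both the command-line --hsts cache and the libcurl callbacks revolve around the same simple bookkeeping: remember, per host, "HTTPS only until time T", and upgrade plain-HTTP requests while that entry is valid. The sketch below illustrates just that idea (in Rust, for illustration only; it is not curl's code, and real HSTS handling also covers includeSubDomains, persistence to disk, and more):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// A minimal, illustrative HSTS cache: host -> "must use HTTPS until this instant".
struct HstsCache {
    entries: HashMap<String, Instant>,
}

impl HstsCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Record a Strict-Transport-Security header received over HTTPS,
    /// e.g. "max-age=31536000; includeSubDomains" (subdomain handling omitted here).
    fn observe_header(&mut self, host: &str, header: &str) {
        for directive in header.split(';') {
            let directive = directive.trim();
            if let Some(value) = directive.strip_prefix("max-age=") {
                if let Ok(secs) = value.trim_matches('"').parse::<u64>() {
                    if secs == 0 {
                        // max-age=0 tells the client to forget the entry.
                        self.entries.remove(host);
                    } else {
                        self.entries
                            .insert(host.to_string(), Instant::now() + Duration::from_secs(secs));
                    }
                }
            }
        }
    }

    /// Should a plain-HTTP request to this host be upgraded to HTTPS?
    fn must_use_https(&self, host: &str) -> bool {
        self.entries.get(host).map_or(false, |expiry| Instant::now() < *expiry)
    }
}

fn main() {
    let mut cache = HstsCache::new();
    cache.observe_header("example.org", "max-age=31536000; includeSubDomains");
    assert!(cache.must_use_https("example.org"));
    assert!(!cache.must_use_https("example.com"));
    println!("example.org will be upgraded to https for future requests");
}
```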
Posted over 3 years ago by Beatriz Rizental
(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)

In a previous TWiG blog post, I talked about my experiment in compiling glean-core to Wasm. The motivation for that experiment was the then-upcoming Glean.js workweek, where some of us were going to take a pass at building a proof-of-concept implementation of Glean in JavaScript. That blog post ends on the following note:

My conclusion is that although we can compile glean-core to Wasm, it doesn't mean that we should do that. The advantages of having a single source of truth for the Glean SDK are very enticing, but at the moment it would be more practical to rewrite something specific for the web.

When I wrote that post, we hadn't gone through the Glean.js workweek and were not sure yet whether it would be viable to pursue a new implementation of Glean in JavaScript. I am not going to keep up the suspense, though. During that workweek we were able to implement a proof-of-concept version of Glean that works in JavaScript environments. It:

- Persisted data throughout application runs (e.g. client_id);
- Allowed for recording event metrics;
- Sent Glean schema compliant pings to the pipeline.

And all of this, we were able to make work on:

- Static websites;
- Svelte apps;
- Node.js servers;
- Electron apps;
- Node.js command line applications;
- Node.js server applications;
- Qt/QML apps.

Check out the code for this project at: https://github.com/brizental/gleanjs

The outcome of the workweek confirmed it was possible and worth it to go ahead with Glean.js. For the past weeks the Glean SDK team has officially started working on the roadmap for this project's MVP. Our plan is to have an MVP of Glean.js that can be used in web extensions by February 2021. The reason for our initial focus on web extensions is that the Ion project has volunteered to be Glean.js' first consumer. Support for static websites and Qt/QML apps will follow. Other consumers such as Node.js servers and CLIs are not part of the initial roadmap.

Although we learned a lot by building the POC, we were probably left with more open questions than answered ones. The JavaScript environment is a very special one, and when we set out to build something that can work virtually anywhere that runs JavaScript, we embarked on an adventure. Each JavaScript environment has different resources the developer can interact with. Let's think, for example, about persistence solutions: in web browsers we can use localStorage or IndexedDB, but on Node.js servers and CLIs we would need to go another way completely and use LevelDB or some other external library. What is the best way to deal with this, and what exactly are the differences between environments?

The issue of having different resources is not even the most challenging one. Glean defines internal metrics and their lifetimes, and internal pings and their schedules. This is important so that our users can do base analysis without having any custom metrics or pings. The hardest open question we were left with was: what pings should Glean.js send out of the box, and what should their scheduling look like? Because Glean.js opens up possibilities for such varied consumers, from websites to CLIs, defining scheduling that will be universal for all of its consumers is probably not even possible.
If we decide to tackle these questions for each environment separately, we are still facing tricky consumers, and consumers that we are not used to, such as websites and web extensions. Websites specifically come with many questions: how can we guarantee client-side data persistence, if a user can easily delete all of it by running some code in the console or tweaking browser settings? What is the best scheduling for pings, if each website can have so many different usage lifecycles?

We are excited to tackle these and many other challenges in the coming months. Development of the roadmap can be followed on Bug 1670910.
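One common way to frame the "different persistence per environment" problem is to program against a narrow storage interface and swap the backend per platform (localStorage or IndexedDB in browsers, LevelDB or similar on Node.js). The following is a hypothetical sketch of that shape, written in Rust only because it mirrors glean-core's implementation language; Glean.js itself is JavaScript, and none of these names are real Glean APIs:

```rust
use std::collections::HashMap;

/// Hypothetical storage interface: core logic records data against this trait,
/// and each environment (web page, Node.js, QML, ...) supplies its own backend.
trait MetricStore {
    fn get(&self, key: &str) -> Option<String>;
    fn set(&mut self, key: &str, value: String);
}

/// In-memory backend, e.g. for tests or environments with no persistence at all.
#[derive(Default)]
struct MemoryStore {
    data: HashMap<String, String>,
}

impl MetricStore for MemoryStore {
    fn get(&self, key: &str) -> Option<String> {
        self.data.get(key).cloned()
    }
    fn set(&mut self, key: &str, value: String) {
        self.data.insert(key.to_string(), value);
    }
}

/// Core logic only sees the trait, so a localStorage-, IndexedDB-, or
/// LevelDB-backed implementation could be dropped in per environment.
fn ensure_client_id(store: &mut dyn MetricStore) -> String {
    if let Some(id) = store.get("client_id") {
        return id;
    }
    let id = "generated-client-id".to_string(); // stand-in for a real UUID
    store.set("client_id", id.clone());
    id
}

fn main() {
    let mut store = MemoryStore::default();
    println!("client_id = {}", ensure_client_id(&mut store));
}
```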
Posted over 3 years ago by Nicholas Nethercote
I last wrote in December 2019 about my work on speeding up the Rust compiler. Time for another update.

Incremental compilation

I started the year by profiling incremental compilation and making several improvements there.

#68914: Incremental compilation pushes a great deal of data through a hash function, called SipHasher128, to determine what code has changed since the last compiler invocation. This PR greatly improved the extraction of bytes from the input byte stream (with a lot of back and forth to ensure it worked on both big-endian and little-endian platforms), giving incremental compilation speed-ups of up to 13% across many benchmarks. It also added a lot more comments to explain what is going on in that code, and removed multiple uses of unsafe.

#69332: This PR reverted the part of #68914 that changed the u8to64_le function in a way that made it simpler but slower. This didn't have much impact on performance because it's not a hot function, but I'm glad I caught it in case it gets used more in the future. I also added some explanatory comments so nobody else will make the same mistake I did!

#69050: LEB128 encoding is used extensively within Rust crate metadata. Michael Woerister had previously sped up encoding and decoding in #46919, but there was some fat left. This PR carefully minimized the number of operations in the encoding and decoding loops, almost doubling their speed, and giving wins on many benchmarks of up to 5%. It also removed one use of unsafe. In the PR I wrote a detailed description of the approach I took, covering how I found the potential improvement via profiling, the 18 different things I tried (10 of which improved speed), and the final performance results.

LLVM bitcode

Last year I noticed from profiles that rustc spends some time compressing the LLVM bitcode it produces, especially for debug builds. I tried changing it to not compress the bitcode, and that gave some small speed-ups, but also increased the size of compiled artifacts on disk significantly.

Then Alex Crichton told me something important: the compiler always produces both object code and bitcode for crates. The object code is used when compiling normally, and the bitcode is used when compiling with link-time optimization (LTO), which is rare. A user is only ever doing one or the other, so producing both kinds of code is typically a waste of time and disk space.

In #66598 I tried a simple fix for this: add a new flag to rustc that tells it to omit the LLVM bitcode. Cargo could then use this flag whenever LTO wasn't being used. After some discussion we decided it was too simplistic, and filed issue #66961 for a more extensive change. That involved getting rid of the use of compressed bitcode by instead storing uncompressed bitcode in a section in the object code (a standard format used by clang), and introducing the flag for Cargo to use to disable the production of bitcode.

The part of rustc that deals with all this was messy. The compiler can produce many different kinds of output: assembly code, object code, LLVM IR, and LLVM bitcode in a couple of possible formats. Some of these outputs are dependent on other outputs, and the choices on what to produce depend on various command-line options, as well as details of the particular target platform. The internal state used to track output production relied on many boolean values, and various nonsensical combinations of these boolean values were possible.
When faced with messy code that I need to understand, my standard approach is to start refactoring. I wrote #70289, #70345, and #70384 to clean up code generation, #70297, #70729, and #71374 to clean up command-line option handling, and #70644 to clean up module configuration. Those changes gave me some familiarity with the code and simplified it, and I was then able to write #70458, which did the main change.

Meanwhile, Alex Crichton wrote the Cargo support for the new -Cembed-bitcode=no option (and also answered a lot of my questions). Then I fixed rustc-perf so it would use the correct revisions of rustc and Cargo together, without which the change would erroneously look like a performance regression on CI. Then we went through a full compiler-team approval and final comment period for the new command-line option, and it was ready to land.

Unfortunately, while running the pre-landing tests we discovered that some linkers can't handle having bitcode in the special section. This problem was only discovered at the last minute because only then are all tests run on all platforms. Oh dear, time for plan B. I ended up writing #71323, which went back to the original, simple approach, with a flag called -Cbitcode-in-rlib=no. [EDIT: note that libstd is still compiled with -Cbitcode-in-rlib=yes, which means that libstd rlibs will still work with both LTO and non-LTO builds.]

The end result was one of the bigger performance improvements I have worked on. For debug builds we saw wins on a wide range of benchmarks of up to 18%, and for opt builds we saw wins of up to 4%. The size of rlibs on disk has also shrunk by roughly 15-20%. Thanks to Alex for all the help he gave me on this! Anybody who invokes rustc directly instead of using Cargo might want to use -Cbitcode-in-rlib=no to get the improvements.

[EDIT (May 7, 2020): Alex subsequently got the bitcode-in-object-code-section approach working in #71528 by adding the appropriate "ignore this section, linker" incantations to the generated code. He then changed the option name back to the original -Cembed-bitcode=no in #71716. Thanks again, Alex!]

Miscellaneous improvements

#67079: Last year in #64545 I introduced a variant of the shallow_resolved function that was specialized for a hot calling pattern. This PR specialized that function some more, winning up to 2% on a couple of benchmarks.

#67340: This PR shrunk the size of the Nonterminal type from 240 bytes to 40 bytes, reducing the number of memcpy calls (because memcpy is used to copy values larger than 128 bytes), giving wins on a few benchmarks of up to 2%.

#68694: InferCtxt is a type that contained seven different data structures within RefCells. Several hot operations would borrow most or all of the RefCells, one after the other. This PR grouped the seven data structures together under a single RefCell in order to reduce the number of borrows performed, for wins of up to 5%.

#68790: This PR made a couple of small improvements to the merge_from_succ function, giving 1% wins on a couple of benchmarks.

#68848: The compiler's macro parsing code had a loop that instantiated a large, complex value (of type Parser) on each iteration, but most of those iterations did not modify the value. This PR changed the code so it initializes a single Parser value outside the loop and then uses Cow to avoid cloning it except on the modifying iterations, speeding up the html5ever benchmark by up to 15%.
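The #68848 change is an instance of a general pattern: build the expensive value once, borrow it by default, and pay for a clone only on the iterations that actually mutate it. A stripped-down sketch of that shape in ordinary Rust (not the compiler's Parser code) looks like this:

```rust
use std::borrow::Cow;

#[derive(Clone, Debug)]
struct Parser {
    // Stand-in for rustc's large, expensive-to-clone parser state.
    tokens: Vec<String>,
}

fn process(items: &[&str]) {
    // Build the expensive value once, outside the loop.
    let base = Parser { tokens: vec!["shared".to_string()] };

    for item in items {
        // Start each iteration with a cheap borrow of the shared value...
        let mut parser: Cow<'_, Parser> = Cow::Borrowed(&base);

        // ...and only clone on the iterations that actually mutate it.
        if item.starts_with("mutating") {
            parser.to_mut().tokens.push((*item).to_string());
        }

        println!("{item}: {} tokens", parser.tokens.len());
    }
}

fn main() {
    process(&["read-only", "mutating-case", "read-only"]);
}
```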
(An aside: I have used Cow several times, and while the concept is straightforward I find the details hard to remember. I have to re-read the documentation each time. Getting the code to work is always fiddly, and I'm never confident I will get it to compile successfully… but once I do it works flawlessly.)

#69256: This PR marked with #[inline] some small hot functions relating to metadata reading and writing, for 1-5% improvements across a number of benchmarks.

#70837: There is a function called find_library_crate that does exactly what its name suggests. It did a lot of repetitive prefix and suffix matching on file names stored as PathBufs. The matching was slow, involving lots of re-parsing of paths within PathBuf methods, because PathBuf isn't really designed for this kind of thing. This PR pre-emptively extracted the names of the relevant files as strings and stored them alongside the PathBufs, and changed the matching to use those strings instead, giving wins on various benchmarks of up to 3%.

#70876: Cache::predecessors is an oft-called function that produces a vector of vectors, and the inner vectors are usually small. This PR changed the inner vector to a SmallVec for some very small wins of up to 0.5% on various benchmarks.

Other stuff

I added support to rustc-perf for the compiler's self-profiler. This gives us one more profiling tool to use on the benchmark suite on local machines.

I found that using LLD as the linker when building rustc itself reduced the time taken for linking from about 93 seconds to about 41 seconds. (On my Linux machine I do this by preceding the build command with RUSTFLAGS="-C link-arg=-fuse-ld=lld".) LLD is a really fast linker! #39915 is the three-year-old issue open for making LLD the default linker for rustc, but unfortunately it has stalled. Alexis Beingessner wrote a nice summary of the current situation. If anyone with knowledge of linkers wants to work on that issue, it could be a huge win for many Rust users.

Failures

Not everything I tried worked. Here are some notable failures.

#69152: As mentioned above, #68914 greatly improved SipHasher128, the hash function used by incremental compilation. That hash function is a 128-bit version of the default 64-bit hash function used by Rust hash tables. I tried porting those same improvements to the default hasher. The goal was not to improve rustc's speed, because it uses FxHasher instead of default hashing, but to improve the speed of all Rust programs that do use default hashing. Unfortunately, this caused some compile-time regressions for complex reasons discussed in detail in the PR, and so I abandoned it. I did manage to remove some dead code in the default hasher in #69471, though.

#69153: While working on #69152, I tried switching from FxHasher back to the improved default hasher (i.e. the one that ended up not landing) for all hash tables within rustc. The results were terrible; every single benchmark regressed! The smallest regression was 4%, the largest was 85%. This demonstrates (a) how heavily rustc uses hash tables, and (b) how much faster FxHasher is than the default hasher when working with small keys.

I tried using ahash for all hash tables within rustc. It is advertised as being as fast as FxHasher but higher quality. I found it made rustc a tiny bit slower. Also, ahash is not deterministic across different builds, because it uses const_random! when initializing hasher state. This could cause extra noise in perf runs, which would be bad.
(Edit: It would also prevent reproducible builds, which would also be bad.)

I tried changing the SipHasher128 function used for incremental compilation from the Sip24 algorithm to the faster but lower-quality Sip13 algorithm. I got wins of up to 3%, but wasn't confident about the safety of the change and so didn't pursue it further.

#69157: Some follow-up measurements after #69050 suggested that its changes to LEB128 decoding were not as clear a win as they first appeared. (The improvements to encoding were still definitive.) The performance of decoding appears to be sensitive to non-local changes, perhaps due to differences in how the decoding functions are inlined throughout the compiler. This PR reverted some of the changes from #69050 because my initial follow-up measurements suggested they might have been pessimizations. But then several sets of additional follow-up measurements taken after rebasing multiple times suggested that the reversions sometimes regressed performance. The reversions also made the code uglier, so I abandoned this PR.

#66405: Each obligation held by ObligationForest can be in one of several states, and transitions between those states occur at various points. This PR reduced the number of states from five to three, and greatly reduced the number of state transitions, which won up to 4% on a few benchmarks. However, it ended up causing some drastic regressions for some users, so in #67471 I reverted those changes.

#60608: This issue suggests using FxIndexSet in some places where currently an FxHashMap plus a Vec are used. I tried it for the symbol table and it was a significant regression for a few benchmarks.

Progress

Since my last blog post, compile times have seen some more good improvements. The following screenshot shows wall-time changes on the benchmark suite since then (2019-12-08 to 2020-04-22). The biggest changes are in the synthetic stress tests await-call-tree-debug, wf-projection-stress-65510, and ctfe-stress-4, which aren't representative of typical code and aren't that important.

Overall it's good news, with many improvements (green), some in the double digits, and relatively few regressions (red). Many thanks to everybody who helped with all the performance improvements that landed during this period.
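For readers who haven't met the LEB128 format discussed around #69050 and #69157: it is a variable-length integer encoding that stores seven payload bits per byte and uses the high bit to signal "more bytes follow". The sketch below is a plain, generic Rust version of unsigned LEB128 for illustration only; it is not rustc's optimized implementation.

```rust
/// Encode an unsigned integer as LEB128: 7 bits per byte, high bit = continuation.
fn encode_uleb128(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Decode an unsigned LEB128 value, returning the value and the number of bytes read.
fn decode_uleb128(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    let mut shift = 0u32;
    for (i, &byte) in bytes.iter().enumerate() {
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
        shift += 7;
        if shift >= 64 {
            return None; // malformed or overflowing input
        }
    }
    None // ran out of bytes mid-value
}

fn main() {
    let mut buf = Vec::new();
    encode_uleb128(624_485, &mut buf);
    assert_eq!(buf, [0xe5, 0x8e, 0x26]); // the classic LEB128 example value
    assert_eq!(decode_uleb128(&buf), Some((624_485, 3)));
    println!("ok: {:02x?}", buf);
}
```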
Posted over 3 years ago
This is part 2 of a deep-dive into the implementation details of Taskcluster's backend data stores. Check out part 1 for the background, as we'll jump right in here!

Azure in Postgres

As of the end of April, we had all of our data in a Postgres database, but the data was pretty ugly. For example, here's a record of a worker as recorded by worker-manager:

  partition_key | testing!2Fstatic-workers
  row_key       | cc!2Fdd~ee!2Fff
  value         | {
                    "state": "requested",
                    "RowKey": "cc!2Fdd~ee!2Fff",
                    "created": "2015-12-17T03:24:00.000Z",
                    "expires": "3020-12-17T03:24:00.000Z",
                    "capacity": 2,
                    "workerId": "ee/ff",
                    "providerId": "updated",
                    "lastChecked": "2017-12-17T03:24:00.000Z",
                    "workerGroup": "cc/dd",
                    "PartitionKey": "testing!2Fstatic-workers",
                    "lastModified": "2016-12-17T03:24:00.000Z",
                    "workerPoolId": "testing/static-workers",
                    "__buf0_providerData": "eyJzdGF0aWMiOiJ0ZXN0ZGF0YSJ9Cg==",
                    "__bufchunks_providerData": 1
                  }
  version       | 1
  etag          | 0f6e355c-0e7c-4fe5-85e3-e145ac4a4c6c

To reap the goodness of a relational database, that would be a "normal"[*] table: distinct columns, nice data types, and a lot less redundancy. All access to this data is via some Azure-shaped stored functions, which are also not amenable to the kinds of flexible data access we need:

- _load - load a single row
- _create - create a new row
- _remove - remove a row
- _modify - modify a row
- _scan - return some or all rows in the table

[*] In the normal sense of the word – we did not attempt to apply database normalization.

Database Migrations

So the next step, which we dubbed "phase 2", was to migrate this schema to one more appropriate to the structure of the data. The typical approach is to use database migrations for this kind of work, and there are lots of tools for the purpose. For example, Alembic and Django both provide robust support for database migrations – but they are both in Python. The only mature JS tool is knex, and after some analysis we determined that it both lacked features we needed and brought a lot of additional features that would complicate our usage. It is primarily a "query builder", with basic support for migrations. Because we target Postgres directly, and because of how we use stored functions, a query builder is not useful. And the migration support in knex, while effective, does not support the more sophisticated approaches to avoiding downtime outlined below. We elected to roll our own tool, allowing us to get exactly the behavior we wanted.

Migration Scripts

Taskcluster defines a sequence of numbered database versions. Each version corresponds to a specific database schema, which includes the structure of the database tables as well as stored functions. The YAML file for each version specifies a script to upgrade from the previous version, and a script to downgrade back to that version. For example, an upgrade script might add a new column to a table, with the corresponding downgrade dropping that column.

  version: 29
  migrationScript: |-
    begin
      alter table secrets add column last_used timestamptz;
    end
  downgradeScript: |-
    begin
      alter table secrets drop column last_used;
    end

So far, this is a pretty normal approach to migrations. However, a major drawback is that it requires careful coordination around the timing of the migration and deployment of the corresponding code. Continuing the example of adding a new column: if the migration is deployed first, then the existing code may execute INSERT queries that omit the new column.
If the new code is deployed first, then it will attempt to read a column that does not yet exist. There are workarounds for these issues: in this example, adding a default value for the new column in the migration, or writing the queries such that they are robust to a missing column. Such queries are typically spread around the codebase, though, and it can be difficult to ensure (by testing, of course) that they all operate correctly.

In practice, most uses of database migrations are continuously-deployed applications – a single website or application server, where the developers of the application control the timing of deployments. That allows a great deal of control, and changes can be spread out over several migrations that occur in rapid succession. Taskcluster is not continuously deployed – it is released in distinct versions which users can deploy on their own cadence. So we need a way to run migrations when upgrading to a new Taskcluster release, without breaking running services.

Stored Functions

Part 1 mentioned that all access to data is via stored functions. This is the critical point of abstraction that allows smooth migrations, because stored functions can be changed at runtime. Each database version specifies definitions for stored functions, either introducing new functions or replacing the implementation of existing functions. So the version: 29 YAML above might continue with:

  methods:
    create_secret:
      args: name text, value jsonb
      returns: ''
      body: |-
        begin
          insert into secrets (name, value, last_used) values (name, value, now());
        end
    get_secret:
      args: name text
      returns: record
      body: |-
        begin
          update secrets set last_used = now() where secrets.name = get_secret.name;
          return query select name, value from secrets where secrets.name = get_secret.name;
        end

This redefines two existing functions to operate properly against the new table. The functions are redefined in the same database transaction as the migrationScript above, meaning that any calls to create_secret or get_secret will immediately begin populating the new column. A critical rule (enforced in code) is that the arguments and return type of a function cannot be changed. To support new code that references the last_used value, we add a new function:

  get_secret_with_last_used:
    args: name text
    returns: record
    body: |-
      begin
        update secrets set last_used = now() where secrets.name = get_secret.name;
        return query select name, value, last_used from secrets where secrets.name = get_secret.name;
      end

Another critical rule is that DB migrations must be applied fully before the corresponding version of the JS code is deployed. In this case, that means code that uses get_secret_with_last_used is deployed only after the function is defined. All of this can be thoroughly tested in isolation from the rest of the Taskcluster code, with both unit tests for the functions and integration tests for the upgrade and downgrade scripts. Unit tests for redefined functions should continue to pass, unchanged, providing an easy-to-verify compatibility check.

Phase 2 Migrations

The migrations from Azure-style tables to normal tables are, as you might guess, a lot more complex than this simple example.
Among the issues we faced:

- Azure-entities uses a multi-field base64 encoding for many data types that must be decoded (such as __buf0_providerData/__bufchunks_providerData in the example above)
- Partition and row keys are encoded using a custom variant of urlencoding that is remarkably difficult to implement in pl/pgsql
- Some columns (such as secret values) are encrypted
- Postgres generates slightly different ISO8601 timestamps from JS's Date.toJSON()

We split the work of performing these migrations across the members of the Taskcluster team, supporting each other through the tricky bits, in a rather long but ultimately successful "Postgres Phase 2" sprint.

0042 - secrets phase 2

Let's look at one of the simpler examples: the secrets service. The migration script creates a new secrets table from the data in the secrets_entities table, using Postgres's JSON functions to unpack the value column into "normal" columns. The database version YAML file carefully redefines the Azure-compatible DB functions to access the new secrets table. This involves unpacking function arguments from their JSON formats, re-packing JSON blobs for return values, and even some light parsing of the condition string for the secrets_entities_scan function. It then defines new stored functions for direct access to the normal table. These functions are typically similar, and more specific to the needs of the service. For example, the secrets service only modifies secrets in an "upsert" operation that replaces any existing secret of the same name.

Step By Step

To achieve an extra layer of confidence in our work, we landed all of the phase-2 PRs in two steps. The first step included the migration and downgrade scripts and the redefined stored functions, as well as tests for those. But critically, this step did not modify the service using the table (the secrets service in this case). So the unit tests for that service use the redefined stored functions, acting as a kind of integration test for their implementations. This also validates that the service will continue to run in production between the time the database migration is run and the time the new code is deployed. We landed this step on GitHub in such a way that reviewers could see a green check-mark on the step-1 commit.

In the second step, we added the new, purpose-specific stored functions and rewrote the service to use them. In services like secrets, this was a simple change, but some other services saw more substantial rewrites due to more complex requirements.

Deprecation

Naturally, we can't continue to support old functions indefinitely: eventually they would be prohibitively complex or simply impossible to implement. Another deployment rule provides a critical escape from this trap: Taskcluster must be upgraded at most one major version at a time (e.g., 36.x to 37.x). That provides a limited window of development time during which we must maintain compatibility. Defining that window is surprisingly tricky, but essentially it's two major revisions. Like the software engineers we are, we packaged up that tricky computation in a script, and include the lifetimes in some generated documentation.

What's Next?

This post has hinted at some of the complexity of "phase 2". There are lots of details omitted, of course! But there's one major detail that got us in a bit of trouble. In fact, we were forced to roll back during a planned migration – not an engineer's happiest moment.
The queue_tasks_entities and queue_artifacts_entities tables were just too large to migrate in any reasonable amount of time. Part 3 will describe what happened, how we fixed the issue, and what we're doing to avoid having the same issue again.
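Taskcluster's services are JavaScript, but the stored-function boundary described above is just ordinary Postgres, so any client can exercise it the same way: callers only ever see functions, never tables. Purely as an illustration of that boundary, here is a hypothetical caller in Rust using the tokio-postgres crate, invoking the get_secret_with_last_used function from the example above (the connection string, column handling, and the assumption that the call returns a single row are all placeholders, not Taskcluster code):

```rust
use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    // Placeholder connection parameters.
    let (client, connection) =
        tokio_postgres::connect("host=localhost user=taskcluster dbname=taskcluster", NoTls).await?;

    // The connection object performs the actual I/O; drive it on a background task.
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            eprintln!("connection error: {e}");
        }
    });

    // get_secret_with_last_used is declared `returns record`, so a column
    // definition list is needed when selecting from it; casting to text keeps
    // this sketch free of extra type-mapping setup.
    let row = client
        .query_one(
            "select name, value::text, last_used::text \
             from get_secret_with_last_used($1) \
             as t(name text, value jsonb, last_used timestamptz)",
            &[&"my-secret"],
        )
        .await?;

    let name: String = row.get(0);
    let value: String = row.get(1);
    let last_used: String = row.get(2);
    println!("{name}: {value} (last used {last_used})");
    Ok(())
}
```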
Posted over 3 years ago by Daniel Stenberg
The other day we celebrated everything curl turning 5 years old, and not too long after that I got myself this printed copy of the Chinese translation in my hands! This version of the book is available for sale on Amazon, and the translation was done by the publisher. The book's full contents are available on GitHub and you can read the English version online on ec.haxx.se.

If you would be interested in starting a translation of the book into another language, let me know and I'll help you get started. Currently the English version consists of 72,798 words, so it's by no means an easy feat to translate! My two other smaller books, http2 explained and HTTP/3 explained, have been translated into twelve(!) and ten languages this way (and there might be more languages coming!).

A collection of printed works authored by yours truly! Inside the Chinese version – yes, I can understand some headlines! Unfortunately I don't read Chinese so I can't tell you how good the translation is!
Posted over 3 years ago by Scott DeVaney
Recommended extensions—a curated list of extensions that meet Mozilla's highest standards of security, functionality, and user experience—are in part selected with input from a rotating editorial board of community contributors. Each board runs for six consecutive months and evaluates a small batch of new Recommended candidates each month. The board's evaluation plays a critical role in helping identify new potential Recommended additions.

We are now accepting applications for community board members through 18 November. If you would like to nominate yourself for consideration on the board, please email us at amo-featured [at] mozilla [dot] org and provide a brief explanation of why you feel you'd make a keen evaluator of Firefox extensions. We'd love to hear about how you use extensions and what you find so remarkable about browser customization.

You don't have to be an extension developer to effectively evaluate Recommended candidates (though indeed many past board members have been developers themselves), but you should have a strong familiarity with extensions and be comfortable assessing the strengths and flaws of their functionality and user experience. Selected contributors will participate in a six-month project that runs from December through May.

Here's the entire collection of Recommended extensions, if you're curious to explore what's currently curated. Thank you, and we look forward to hearing from interested contributors by the 18 November application deadline!

The post Contribute to selecting new Recommended extensions appeared first on the Mozilla Add-ons Blog.