Posted almost 4 years ago
I’m pleased to announce the release of librest 0.9.0, a library for interacting with “RESTful” web services. The library is old and not particularly big, but it handles interaction with REST APIs in a convenient fashion. After a long period in maintenance mode I picked it up and brought it into 2022. Most of the deprecated API calls are gone now, and it should now be possible to parallel-install librest with the previous release.
Posted almost 4 years ago
I’m attending the https://linux.conf.au/ conference online this weekend, which is always a good opportunity for some sideline hacking.
I found something boneheaded doing that today.
There have been a few times while developing the OpenHMD Rift driver where I’ve noticed something strange and followed the thread until it made sense. Sometimes that leads to improvements in the driver, sometimes not.
In this case, I wanted to generate a graph of how long the computer vision processing takes – from the moment each camera frame is captured until poses are generated for each device.
To do that, I have some logging branches that output JSON events to log files, and I write scripts to process those. I used that data and produced:
Pose recognition latency. dt = inter-pose spacing, delay = frame-to-pose latency
Two things caught my eye in this graph. The first is the way the baseline latency (pink lines) increases from ~20ms to ~58ms. The 2nd is the quantisation effect, where pose latencies are clearly moving in discrete steps.
Neither of those should be happening.
Camera frames are being captured from the CV1 sensors every 19.2ms, and it takes 17-18ms for them to be delivered across the USB. Depending on how many IR sources the cameras can see, figuring out the device poses can take a varying amount of time, but the baseline should always hover around 17-18ms, because the fast “device tracking locked” case takes as little as 1ms.
Did you see me mention 19.2ms as the interframe period? Guess what the spacing on those quantisation levels is in the graph? I recognised it as implying that something in the processing is tied to frame timing when it should not be.
OpenHMD Rift CV1 tracking timing
This 2nd graph helped me pinpoint what exactly was going on. This graph is cut from the part of the session where the latency has jumped up. What it shows is a ~1 frame delay between when the frame is received (frame-arrival-finish-local-ts) and when the initial analysis even starts!
That could imply that the analysis thread is just busy processing the previous frame and doesn’t start working on the new one right away – but the graph says that fast analysis is typically done in 1-10ms at most. It should rarely be busy when the next frame arrives.
This is where I found the boneheaded code – a rookie mistake I wrote when putting the image analysis threads in place early in the driver development and never noticed.
There are 3 threads involved:
USB service thread, reading video frame packets and assembling pixels in framebuffers
Fast analysis thread, that checks tracking lock is still acquired
Long analysis thread, which does brute-force pose searching to reacquire / match unknown IR sources to device LEDs
These 3 threads communicate using frame worker queues passing frames between each other. Each analysis thread does this pseudocode:
while driver_running:
    Pop a frame from the queue
    Process the frame
    Sleep for new frame notification
The problem is in the 3rd line. If the driver is ever still processing the frame in line 2 when a new frame arrives – say because the computer got really busy – the thread sleeps anyway and won’t wake up until the next frame arrives. At that point, there’ll be 2 frames in the queue, but it still only processes one – so the analysis gains a 1 frame latency from that point on. If it happens a second time, it gets later by another frame! Any further and it starts reclaiming frames from the queues to keep the video capture thread fed – but it only reclaims one frame at a time, so the latency remains!
The fix is simple:
while driver_running:
    Pop a frame
    Process the frame
    if queue_is_empty():
        sleep for new frame notification
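In real code, that check-before-sleeping pattern maps naturally onto a condition variable. Here's a minimal illustrative sketch in Rust – the `FrameQueue` and `Frame` names are invented for this example, and the actual driver is C code with its own queue types:

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};

struct Frame(u32);

struct FrameQueue {
    frames: Mutex<VecDeque<Frame>>,
    notify: Condvar,
}

impl FrameQueue {
    fn new() -> Self {
        FrameQueue {
            frames: Mutex::new(VecDeque::new()),
            notify: Condvar::new(),
        }
    }

    // Producer: the USB service thread pushes a completed frame and
    // wakes one waiting analysis thread.
    fn push(&self, frame: Frame) {
        self.frames.lock().unwrap().push_back(frame);
        self.notify.notify_one();
    }

    // Consumer: only block while the queue is actually empty. This is
    // the fix – if frames piled up while we were busy processing, we
    // drain them immediately instead of sleeping a full frame interval.
    fn pop(&self) -> Frame {
        let mut queue = self.frames.lock().unwrap();
        while queue.is_empty() {
            queue = self.notify.wait(queue).unwrap();
        }
        queue.pop_front().unwrap()
    }
}
```

The `while queue.is_empty()` loop is what the buggy version was missing: waiting unconditionally converts any momentary backlog into a permanent one-frame lag.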
Doing that for both the fast and long analysis threads changed the profile of the pose latency graph completely.
Pose latency and inter-pose spacing after fix
This is a massive win! To be clear, this has been causing problems in the driver for at least 18 months but was never obvious from the logs alone. A single good graph is worth a thousand logs.
What does this mean in practice?
The fusion filter I’ve built works like this: in between pose updates from the cameras, the position and orientation of each device are predicted using the accelerometer and gyro readings. Particularly for position, using the IMU for prediction drifts fairly quickly. The longer the driver spends ‘coasting’ on the IMU, the less accurate the position tracking is. So, the sooner the driver can get a correction from the camera into the fusion filter, the less drift we’ll get – especially under fast motion, and particularly for the hand controllers that get waved around.
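To make the cost of ‘coasting’ concrete, here's a hypothetical constant-acceleration prediction step of the kind a fusion filter performs between camera updates (the function name and 1-D signature are illustrative, not the driver's actual API). Position error grows with the square of the time since the last correction, which is why shaving tens of milliseconds off the camera latency matters:

```rust
/// One IMU dead-reckoning step: integrate the acceleration reading
/// over dt. pos/vel are 1-D here for clarity; a real filter tracks
/// 3-D position plus orientation.
fn predict_position(pos: f64, vel: f64, accel: f64, dt: f64) -> (f64, f64) {
    let new_pos = pos + vel * dt + 0.5 * accel * dt * dt;
    let new_vel = vel + accel * dt;
    (new_pos, new_vel)
}
```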
Before: Left Controller pose delays by sensor
After: Left Controller pose delays by sensor
Poses are now being updated up to 40ms earlier and the baseline is consistent with the USB transfer delay.
You can also visibly see the effect of the JPEG decoding support I added over Christmas. The ‘red’ camera is directly connected to USB3, while the ‘khaki’ camera is feeding JPEG frames over USB2 that then need to be decoded, adding a few ms delay.
The latency reduction is nicely visible in the pose graphs, where the ‘drop shadow’ effect of pose updates tailing fusion predictions largely disappears and there are fewer large gaps in the pose observations when long analysis happens (visible as straight lines jumping from point to point in the trace):
Before: Left Controller poses
After: Left Controller poses
Posted almost 4 years ago
One thing that I’ve seen confuse newcomers to writing GObject-based Rust code is the glib::clone!() macro. It’s foreign to people coming from normal Rust code who are trying to write GObject-based code, and it’s foreign to many people used to writing GObject-based code in other languages (e.g. C, Python, JavaScript, and Vala). Over the years I’ve explained it a few times, and I figured I should write a blog post that I can point people to, describing in detail what the clone!() macro is, what it does, and why we need it.
Closures and Clones in Plain Rust
Rust has a nifty thing called a closure. To quote the official Rust book:
…closures are anonymous functions you can save in a variable or pass as arguments to other functions. You can create the closure in one place and then call the closure to evaluate it in a different context. Unlike functions, closures can capture values from the scope in which they’re defined.
Simply put, a closure is a function you can use as a variable or an argument to another function. Closures can “capture” variables from the environment, meaning that you can easily pass variables within your scope without needing to pass them as arguments. Here’s an example of capturing:
let num = 1;
let num_closure = move || {
    println!("Num times 2 is {}", num * 2); // `num` captured here
};
num_closure();
num is an i32, or a signed 32-bit integer. Integers are cheap, statically sized primitives, and they don’t require any special behavior when they are dropped. Because of this, it’s safe to keep using them after a move – so the type can and does implement the Copy trait. In practice, that means we can use our integer after the closure captures it, as it captures a copy. So we can have:
// Everything above stays the same
num_closure();
println!("Num is {}", num);
And the compiler will be happy with us. What happens if you need something dynamically sized and stored on the heap, like the data from a String? If we try this pattern with a String:
let string = String::from("trust");
let string_closure = move || {
    println!("String contains \"rust\": {}", string.contains("rust"));
};
string_closure();
println!("String is \"{}\"", string);
We get the following error:
error[E0382]: borrow of moved value: `string`
--> src/main.rs:10:34
|
4 | let string = String::from("trust");
| ------ move occurs because `string` has type `String`, which does not implement the `Copy` trait
5 | let string_closure = move || {
| ------- value moved into closure here
6 | println!("String contains \"rust\": {}", string.contains("rust"));
| ------ variable moved due to use in closure
...
10 | println!("String is \"{}\"", string);
| ^^^^^^ value borrowed here after move
Values of the String type cannot be copied, so the compiler instead “moves” our string, giving the closure ownership. In Rust, only one thing can have ownership of a value. So when the closure captures string, our outer scope no longer has access to it. That doesn’t mean we can’t use string in our closure, though. We just need to be more explicit about how it should be handled.
Rust provides the Clone trait that we can implement for objects like this. Clone provides the clone() method, which explicitly duplicates an object. Types that implement Clone but not Copy are generally types that can be of an arbitrary size and are stored on the heap. Values of the String type can vary in size, which is why String falls into this category. When you call clone(), you usually create a new full copy of the object’s data on the heap. So, we want to create a clone, and pass only that clone into the closure:
let s = string.clone();
let string_closure = move || {
    println!("String contains \"rust\": {}", s.contains("rust"));
};
The closure will only capture our clone, and we can still use the original in our original scope.
If you need more information on cloning and ownership, I recommend reading the “Understanding Ownership” chapter of the official Rust book.
Reference Counting, Abbreviated
When working with types of an arbitrary size, we may have types that are too large to efficiently clone(). For these types, we can use reference counting. In Rust, there are two types for this you’re likely to use: Rc for single-threaded contexts, and Arc for multi-threaded contexts. For now let’s focus on Rc.
When working with reference-counted types, the reference-counted object is kept alive for as long as anything holds a “strong” reference. Instead of creating a full copy, Rc creates a new Rc instance when you call .clone() and increments the number of strong references. The number of strong references is decreased when an instance of Rc goes out of scope. An Rc<T> can often be used in contexts where a reference &T is used. In particular, calling a method that takes &self on an Rc<T> will call the method on the underlying T. For example, some_string.as_str() would work the same whether some_string were a String or an Rc<String>.
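As a quick sketch of that behaviour, Rc::strong_count() lets us watch the counter move (a small illustrative example, not from the original post):

```rust
use std::rc::Rc;

// Returns the strong count after cloning and after dropping the clone,
// demonstrating that clone() on an Rc only bumps a counter.
fn rc_counts() -> (usize, usize) {
    let s: Rc<String> = Rc::new(String::from("trust"));

    // clone() bumps the strong count; the String's heap data is not copied.
    let s2 = s.clone();
    let after_clone = Rc::strong_count(&s);

    // Methods taking &self are dispatched through to the underlying String.
    assert!(s.contains("rust"));

    // Dropping an Rc decrements the count again.
    drop(s2);
    let after_drop = Rc::strong_count(&s);

    (after_clone, after_drop)
}
```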
For our example, we can simply wrap our String constructor with Rc::new():
let string = Rc::new(String::from("trust"));
let s = string.clone();
let string_closure = move || {
    println!("String contains \"rust\": {}", s.contains("rust"));
};
string_closure();
println!("String is \"{}\"", string);
With this, we can capture and use larger values without creating expensive copies. There are some consequences to naively using clone(), and we’ll get into those below, but in a slightly different context.
Closures and Copies in GObject-based Rust
When working with GObject-based Rust, particularly gtk-rs, closures come up most often when working with signals. Signals are a GObject concept. To (over)simplify, signals are used to react to and modify object-specific events. For more detail I recommend reading the “Signals” section in the “Type System Concepts” documentation. Here’s what you need to know:
Signals are emitted by objects.
Signals can carry data in the form of parameters that connections may use.
Signals can expect their handlers to have a return type that’s used elsewhere.
Let’s take a look at how this works with a C example. Say we have a GtkButton, and we want to react when the button is clicked. Most code will use the g_signal_connect () function macro to register a signal handler. g_signal_connect () takes 4 parameters:
The GObject that we expect to emit the signal
The name of the signal
A GCallback that is compatible with the signal’s parameters
data, a pointer to arbitrary data that will be passed along to the callback.
The object here is our GtkButton instance. The signal we want to connect to is the “clicked” signal. The signal expects a callback with the signature of void clicked (GtkButton *self, gpointer user_data). So we need to write a function that has that signature. user_data here corresponds to the data parameter that we give g_signal_connect (). With all of that in mind, here’s what connecting to the signal would typically look like in C:
void
button_clicked_cb (GtkButton *button,
                   gpointer   user_data)
{
  MyObject *self = MY_OBJECT (user_data);
  my_object_do_something_with_button (self, button);
}

static void
my_object_some_setup (MyObject *self)
{
  GtkWidget *button = gtk_button_new_with_label ("Do Something");
  g_signal_connect (button, "clicked",
                    G_CALLBACK (button_clicked_cb), self);
  my_object_add_button (button); // Assume this does something to keep button alive
}
This is the simplest way to handle connecting to the signal. But we have an issue with this setup: what if we want to pass multiple values to the callback that aren’t necessarily part of MyObject? You would need to create a custom struct that houses each value you want to pass, use that struct as data, and read each field of that struct within your callback.
Instead of having to create a struct for each callback that needs to take multiple arguments, in Rust we can and do use closures. The gtk-rs bindings are nice in that they have generated functions for each signal a type can emit. So for gtk::Button we have connect_clicked (). These generated functions take a closure as an argument, with the closure taking the same arguments that the signal expects – except user_data. However, because Rust closures can capture variables, we don’t need user_data – the closure essentially becomes a struct containing captured variables, and the pointer to it becomes user_data. So, let’s try to do a direct port of the functions above, and condense them down to one function with a closure inside:
impl MyObject {
    pub fn some_setup(&self) {
        let button = gtk::Button::with_label("Do Something");
        button.connect_clicked(move |btn| {
            self.do_something_with_button(btn);
        });
        self.add_button(button);
    }
}
This looks pretty nice, right? The catch is, it doesn’t compile:
error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
--> src/lib.rs:33:36
|
30 | pub fn some_setup(&self) {
| ----- this data with an anonymous lifetime `'_`...
...
33 | button.connect_clicked(move |btn| {
| ____________________________________^
34 | | self.do_something_with_button(btn);
35 | | });
| |_____________^ ...is captured here...
|
note: ...and is required to live as long as `'static` here
--> src/lib.rs:33:20
|
33 | button.connect_clicked(move |btn| {
| ^^^^^^^^^^^^^^^
Lifetimes can be a bit confusing, so I’ll try to simplify. &self is a reference to our object. It’s like the C pointer MyObject *self, except it has guarantees that C pointers don’t have: notably, references must always be valid where they are used. The compiler is telling us that by the time our closure runs – which could be any point while button is alive – our reference may not be valid, because our &self method argument (by declaration) only lives to the end of the method. There are a few ways to solve this: change the lifetime of our reference and ensure it matches the closure’s lifetime, or find a way to pass an owned object to the closure.
Lifetimes are complex – I don’t recommend worrying about them unless you really need the extra performance from using references everywhere. There’s a big complication with trying to work with lifetimes here: our closure has a specific lifetime bound. If we take a look at the function signature for connect_clicked():
fn connect_clicked<F: Fn(&Self) + 'static>(&self, f: F) -> SignalHandlerId
We can see that the closure (and thus everything captured by the closure) has the 'static lifetime. This can mean different things in different contexts, but here that means that the closure needs to be able to hold onto the type for as long as it wants. For more detail, see “Rust by Example”’s chapter on the static lifetime. So, the only option is for the closure to own the objects it captures.
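The same bound is easy to reproduce outside GObject. In this sketch, `call_later` (a name invented for the example) mirrors connect_clicked()'s `F: Fn() + 'static` requirement – a closure that owns its captures satisfies the bound, while one capturing a short-lived reference would not compile:

```rust
// Mirrors connect_clicked()'s `F: Fn(...) + 'static` bound;
// `call_later` is a hypothetical name for this sketch.
fn call_later<F: Fn() -> usize + 'static>(f: F) -> usize {
    f()
}

fn static_capture_demo() -> usize {
    let s = String::from("owned data");
    // Moving the owned String into the closure satisfies 'static:
    // the closure can live arbitrarily long because it owns everything
    // it captured. Capturing `&s` instead would be rejected, since the
    // reference dies at the end of this function.
    call_later(move || s.len())
}
```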
The trick to giving ownership of something you don’t necessarily own is to duplicate it. Remember clone()? We can use that here. You might think it’s expensive to clone your object, especially if it’s a large and complex widget, like your main window. There’s something very nice about GObjects though: all GObjects are reference-counted. So, cloning a GObject instance is like cloning an Rc instance. Instead of making a full copy, the number of strong references increases. So, we can change our code to use clone just like we did in our original String example:
pub fn some_setup(&self) {
    let button = gtk::Button::with_label("Do Something");
    let s = self.clone();
    button.connect_clicked(move |btn| {
        s.do_something_with_button(btn);
    });
    self.add_button(button);
}
All good, right? Unfortunately, no. This might look innocent, and in some programs cloning like this might not cause any issues. But what if button wasn’t owned by MyObject? Take this version of the function:
pub fn some_setup(&self, button: &gtk::Button) {
    let s = self.clone();
    button.connect_clicked(move |btn| {
        s.do_something_with_button(btn);
    });
}
button is now merely passed to some_setup(). It may be owned by some other widget that may be alive for much longer than we want MyObject to be alive. Think back to the description of reference counting: objects are kept alive for as long as a strong reference exists. We’ve given a strong reference to the closure we attached to the button. That means MyObject will be forcibly kept alive for as long as the closure is alive, which is potentially as long as button is alive. MyObject and the memory associated with it may never be cleaned up, and that gets more problematic the bigger MyObject is and the more instances we have.
Now, we could structure our program differently to avoid this specific case, but for now let’s continue using it as an example. How do we keep our closure from controlling the lifetime of MyObject when we need to be able to use MyObject when the closure runs? Well, in addition to “strong” references, reference counting has the concept of “weak” references. The number of weak references an object has is tracked, but it doesn’t need to be 0 in order for the object to be dropped. With an Rc instance we’d use Rc::downgrade() to get a Weak<T>, and with a GObject we use ObjectExt::downgrade() to get a WeakRef<T>. In order to turn a weak reference back into a usable instance of an object we need to “upgrade” it. Upgrading a weak reference can fail, since weak references do not keep the referenced object alive. So Weak::upgrade() returns an Option<Rc<T>>, and WeakRef::upgrade() returns an Option<T>. Because it’s optional, we should only move forward if T still exists.
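In plain Rust, the downgrade/upgrade dance looks like this (a small illustrative example using Rc and Weak, not from the original post):

```rust
use std::rc::{Rc, Weak};

// Returns whether upgrading succeeded before and after the last
// strong reference was dropped.
fn weak_demo() -> (bool, bool) {
    let strong = Rc::new(String::from("gnome"));
    let weak: Weak<String> = Rc::downgrade(&strong);

    // While a strong reference exists, upgrading succeeds.
    let alive_before = weak.upgrade().is_some();

    // A weak reference alone does not keep the value alive...
    drop(strong);

    // ...so upgrading now fails, and callers must handle the None case.
    let alive_after = weak.upgrade().is_some();

    (alive_before, alive_after)
}
```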
Let’s rework our example to use weak references. Since we only care about doing something when the object still exists, we can use if let here:
pub fn some_setup(&self, button: &gtk::Button) {
    let s = self.downgrade();
    button.connect_clicked(move |btn| {
        if let Some(obj) = s.upgrade() {
            obj.do_something_with_button(btn);
        }
    });
}
Only two more lines, but a little more annoying than just calling clone(). Now, what if we have another widget we need to capture?
pub fn some_setup(&self, button: &gtk::Button, widget: &OtherWidget) {
    let s = self.downgrade();
    let w = widget.downgrade();
    button.connect_clicked(move |btn| {
        if let (Some(obj), Some(widget)) = (s.upgrade(), w.upgrade()) {
            obj.do_something_with_button(btn);
            widget.set_visible(false);
        }
    });
}
That’s getting harder to parse. Now, what if the closure needed a return value? Let’s say it should return a boolean. We need to handle our intended behavior for when MyObject and OtherWidget still exist, and we need to handle the fallback for when they don’t:
pub fn some_setup(&self, button: &gtk::Button, widget: &OtherWidget) {
    let s = self.downgrade();
    let w = widget.downgrade();
    button.connect_clicked(move |btn| {
        if let (Some(obj), Some(widget)) = (s.upgrade(), w.upgrade()) {
            obj.do_something_with_button(btn);
            widget.visible()
        } else {
            false
        }
    });
}
Now we have something pretty off-putting. If we want to avoid keeping around unwanted objects or potential reference cycles, this will get worse for every object we want to capture. Thankfully, we don’t have to write code like this.
Enter the glib::clone!() Macro
The glib crate provides a macro to solve all of these cases. The macro takes the variables you want to capture as @weak or @strong, and the capture behavior corresponds to upgrading/downgrading and calling clone(), respectively. So, starting with the example behavior that kept MyObject around, if we really wanted that we would write the function like this:
pub fn some_setup(&self, button: &gtk::Button) {
    button.connect_clicked(clone!(@strong self as s => move |btn| {
        s.do_something_with_button(btn);
    }));
}
We use self as s because self is a keyword in Rust. We don’t need to rename a variable unless it’s a keyword or some field (e.g. foo.bar as bar). Here, glib::clone!() doesn’t prevent us from holding onto s forever, but it does provide a nicer way of doing it should we want to. If we want to use a weak reference instead, it would be:
button.connect_clicked(clone!(@weak self as s => move |btn| {
    s.do_something_with_button(btn);
}));
Just one word and we no longer have to worry about MyObject sticking around when it shouldn’t. For the example with multiple captures, we can use comma separation to pass multiple variables:
pub fn some_setup(&self, button: &gtk::Button, widget: &OtherWidget) {
    button.connect_clicked(clone!(@weak self as s, @weak widget => move |btn| {
        s.do_something_with_button(btn);
        widget.set_visible(false);
    }));
}
Very nice. It’s also simple to provide a fallback for return values:
button.connect_clicked(clone!(@weak self as s, @weak widget => @default-return false, move |btn| {
    s.do_something_with_button(btn);
    widget.visible()
}));
Now, instead of spending time and code on using weak references and fallbacks correctly, we can rely on glib::clone!() to handle it for us succinctly.
There are a few caveats to using glib::clone!(). Errors in your closures may be harder to spot, as the compiler may point to the site of the macro instead of the exact site of the error. rustfmt also can’t format the contents inside the macro. For that reason, if your closure is getting too long I would recommend separating the behavior into a proper function and calling that.
Overall, I recommend using glib::clone!() when working on gtk-rs codebases. I hope this post helps you understand what it’s doing when you come across it, and that you know when you should use it.
Posted almost 4 years ago
Update on what happened across the GNOME project in the week from January 07 to January 14.
Core Apps and Libraries
GNOME Contacts
Keep and organize your contacts information.
nielsdg says
GNOME Contacts has been ported to GTK 4 and libadwaita, making sure it nicely fits in with a lot of other core apps in GNOME 42.
Mutter
A Wayland display server and X11 window manager and compositor library.
robert.mader says
Thanks to Jonas Ådahl we now support the new Wayland dmabuf feedback protocol. The protocol (for communication between clients and Mutter), together with some improvements to Mutter’s native backend (communication between Mutter and the kernel), allows a number of optimizations. In GNOME 42, for example, this will allow us to use direct scanout with most fullscreen OpenGL or Vulkan clients – something we already supported in recent versions, but only in very select cases. You can think of this as a more sophisticated version of X11 unredirect, notably without tearing.
What does this mean for users? The obvious part is that it will squeeze some more FPS out of GPUs when running games. To me, the even more important part is that it will help reduce energy consumption and thus increase battery life for e.g. video players. When playing a fullscreen video, doing a full size extra copy of every frame takes up a significant part of GPU time and skipping that allows the hardware to clock down.
What does this mean for developers? Fortunately, support for this protocol is built into OpenGL and Vulkan drivers. I personally spent a good chunk of time over the last two years helping to make Firefox finally use OpenGL by default. Now I’m very pleased to get this efficiency boost for free. Similarly, if you’re considering porting your app from GTK3 to GTK4 (the latter uses OpenGL by default), this might be a further incentive to do so.
What next? In future versions of GNOME we plan to support scanout for non-fullscreen windows. Also, users with multi-GPU devices can expect to benefit significantly from further improvements.
Libadwaita
Building blocks for modern GNOME apps using GTK4.
Alexander Mikhaylenko reports
Builder and Logs now support the upcoming dark style preference.
GJS
Use the GNOME platform libraries in your JavaScript programs. GJS powers GNOME Shell, Polari, GNOME Documents, and many other apps.
ptomato announces
In GJS this week:
Evan Welsh made GObject interfaces enumerable, so you can now do things like Object.keys(Gio.File.prototype) and get a list of the methods, like you can with other GObject types.
Evan also fixed a memory leak with callbacks.
Marco Trevisan and myself landed a large refactor involving type safety.
Chun-wei Fan kept everything buildable on Windows.
Thanks to Sonny Piers, Sergei Trofimovich, and Eli Schwartz for various other contributions.
Cantarell
Jakub Steiner says
GNOME’s UI typeface, Cantarell, has gotten a new minisite at cantarell.gnome.org. We finally have a canonical place for the font binary downloads, but the site also demos the extensive weight coverage of the variable font. I’m happy the typeface now has a respectable home for the amount of work Nikolaus Waxweiler has poured into it in the past few years. Thank you!
Circle Apps and Libraries
Secrets
A password manager which makes use of the KeePass v.4 format.
Maximiliano says
Secrets, formerly known as Password Safe, version 6.0 was just released, featuring the recent GTK 4 port, libadwaita, and OTP support. Due to the rename, it is now available as org.gnome.World.Secrets on Flathub.
gtk-rs
Safe bindings to the Rust language for fundamental libraries from the GNOME stack.
Bilal Elmoussaoui announces
gtk4-rs now has a Windows MSVC CI pipeline. This will ensure the bindings build just fine and avoid regressions for Windows users who want to build applications using GTK4 and Rust.
Gaphor
A simple UML and SysML modeling tool.
Arjan announces
In our upcoming release of Gaphor, based on popular demand, we now support diagram types! If you create an activity diagram, for example, it adds diagram info to the upper left of the diagram and collapses the toolbox to show only the tools relevant for that diagram type.
Fragments
Easy to use BitTorrent client.
Felix announces
I added context menus to Fragments to make common actions easier and faster to perform. These are primarily intended for desktop users, but can also be activated on touchscreens by long pressing and holding.
Commit
An editor that helps you write better Git and Mercurial commit messages.
sonnyp announces
The Commit message editor now uses GtkSourceView, which allows for new features and improvements. It’s also now available to translate on Weblate.
Third Party Projects
sonnyp announces
Tobias Bernard and I started working on Playhouse, an HTML/CSS/JavaScript playground for GNOME.
There is no release yet, but contributions and feedback are welcome.
Powered by GTK 4, GJS, libadwaita, GtkSourceView and WebKitGTK!
Corentin Noël announces
We are happy to announce the first public alpha release of libshumate, the GTK4 map widget library announced in 2019. This first unstable release contains everything one needs to embed a minimal map view. This library completely replaces libchamplain, which used Clutter, and now provides a native way to control maps in GTK4. Application developers are encouraged to use libshumate and report any issue that might occur or any feature that is missing from the library.
flxzt announces
I have been working on it for a while, but now it is ready for an announcement: Rnote is a vector-based drawing app to create handwritten notes and to annotate pictures and PDFs. It features an endless sheet, different pen types with stylus pressure support, shapes and tools. It also has an integrated workspace browser, and you can choose between different background colors and patterns. It can be downloaded as a Flatpak from Flathub.
dabrain34 announces
GstPipelineStudio aims to provide a graphical user interface to the GStreamer framework. From a first step in the framework with a simple pipeline to a complex pipeline debugging, the tool provides a friendly interface to add elements to a pipeline and debug it.
Phosh
A pure Wayland shell for mobile devices.
Guido says
Panzer Sajt added support for non-numeric passwords to phosh. Some bits of Sam Hewitt’s ongoing style refresh are also already visible in the video, as is the new VPN indicator in the top bar.
Documentation
Emmanuele Bassi announces
I merged the initial batch of beginner tutorials for the GNOME Developer Documentation website. They are meant to be used as a bridge between the HIG and API references, providing useful information about UI elements with code examples in multiple programming languages. More to come in the future!
Miscellaneous
Sophie Herold announces
The app pages on apps.gnome.org are now coming with a more exciting header design. Further, page rendering times have been optimized and a few issues with right-to-left scripts have been fixed. The latter surfaced with the newly added Hebrew translation.
That’s all for this week!
See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!
Posted almost 4 years ago
After a lot of hard work, libadwaita 1.0 was released on the last day of 2021. If you haven’t already, check out Alexander’s announcement, which covers a lot of what’s in the new release.
When we rewrote the HIG back in May 2021, the new version expected and recommended libadwaita. However, libadwaita evolved between then and 1.0, so changes were needed to bring the HIG up to date.
Therefore, over the last two or three weeks, I’ve been working on updating the HIG to cover libadwaita 1.0. Hopefully this will mean that developers who are porting to GTK 4 and libadwaita have everything that they need in terms of design documentation but, if anything isn’t clear, do reach out using the usual GNOME design channels.
In the rest of this post, I’ll review what’s changed in the HIG, compared with the previous version.
What’s changed
There’s a bunch of new content in the latest HIG version, which reflects additional capabilities that are present in libadwaita 1.0. This includes material on:
How to style header bar buttons
Pill and circular button styles
Boxed list convenience widgets (action rows, combo rows, and expander rows)
Toasts
Dark mode
There have also been updates to existing content: all screenshots have been updated to use the latest UI style from libadwaita, and the guidelines on UI styling have been updated, to reflect the flexibility that comes with libadwaita’s new stylesheet.
As you might expect, there have been some general improvements to the HIG, which are unrelated to libadwaita. The page on navigation has been improved, to make it more accessible. A page on selection mode has also been added (we used to have this documented, then dropped the documentation while the pattern was updated). There has also been a large number of small style and structure changes, which should make the HIG an easier read.
If you spot any issues, the HIG issue tracker is open, and you can send merge requests too!
|
|
Posted
almost 4 years
ago
Librsvg's documentation tooling is pretty ancient. The man page for
rsvg-convert is written by hand in troff, and the C library's
reference documentation still uses the venerable gtk-doc.
As part of the modernization effort, I have turned the man page into a reStructuredText document, and the C API documentation into gi-docgen. This post describes how I did that.
You can read librsvg's new documentation here.
From man to rst
The man page for rsvg-convert was written in troff, which is pretty cumbersome. The following gunk defines a little paragraph and a table:
.P
You can also specify dimensions as CSS lengths, for example
.B 10px
or \"
.BR 8.5in .
The unit specifiers supported are as follows:
.RS
.TS
tab (@);
l lx.
px@T{
pixels (the unit specifier can be omitted)
T}
in@T{
inches
T}
cm@T{
centimeters
T}
mm@T{
millimeters
T}
pt@T{
points, 1/72 inch
T}
pc@T{
picas, 1/6 inch
T}
.TE
Yeah, nope. We have better tools now like rst2man, which take a
reStructuredText document — fancy plain text — and turn it into a
troff man page. I just had to use a command line like
pandoc --from=man --to=rst rsvg-convert.1 > rsvg-convert.rst
and then tweak the output a little:
You can also specify dimensions as CSS lengths, for example ``10px`` or
``8.5in``. The unit specifiers supported are as follows:
== ==========================================
px pixels (the unit specifier can be omitted)
in inches
cm centimeters
mm millimeters
pt points, 1/72 inch
pc picas, 1/6 inch
== ==========================================
Much better, right?
I've learned that Pandoc is awesome. Pure magic, highly recommended.
I hope to integrate the man page into a prettier user manual for
rsvg-convert at some point. It's no longer a trivial program, and its
options allow for some interesting combinations that could use some
illustrations and generally more detail than a man page.
From gtk-doc to gi-docgen
I highly recommend that you read Emmanuele's initial description of
gi-docgen, which includes the history of gtk-doc, a
description of its shortcomings, and how gi-docgen is a simpler tool
that leverages the fact that GObject Introspection already slurps
documentation from source code and so obviates most of gtk-doc
already.
Summary of how gi-docgen works:
The C code has documentation comments in Markdown format, with
annotations for GObject Introspection. (Note: librsvg has no
C code for the library, so those documentation comments actually
live in the .h header files that it installs for the benefit of C
programs.)
The library gets compiled and introspected. In this step,
g-ir-scanner(1) extracts annotations and documentation from the
source code and puts them in the MyLibrary.gir XML file.
You write a small configuration file to tell
gi-docgen about the structure of your documentation. Unlike
gtk-doc, you don't need to write a DocBook skeleton or anything
complicated. Stand-alone chapters can be individual Markdown files,
and the configuration file just lists them in the order you want
them to appear. Gi-docgen automatically includes all the classes,
types, functions, etc. from your code into the docs.
Finally, it runs very fast. Gtk-doc was always slow due to xsltproc and complicated stylesheets to turn a DocBook document into browsable HTML documentation. Gi-docgen is much leaner.
Doing the conversion
Unlike the mostly automatic pandoc step for the man page, I converted the documentation comments from DocBook to Markdown by hand. For librsvg this took me a moderately caffeinated afternoon;
it's a little fiddly business, but nothing out of this world.
You can look forward to having good error messages from gi-docgen
when something goes wrong, unlike gtk-doc, whose errors I always
tended to ignore until the last minute because they were so hard to
discern and diagnose.
Some hints:
DocBook hyperlinks that looked like <ulink url="blahblah.html">blah blah</ulink> get turned into [blah blah](blahblah.html) in Markdown.
Gi-docgen allows references to methods like
[[email protected]_child] - see the
linking documentation for other kinds of links.
You can get progressively fancy with introspection
attributes.
There is no direct mapping between DocBook's extremely granular semantic markup and Markdown conventions, so for example I'd substitute both <literal>foobar</literal> and <filename>/foo/bar</filename> for `foobar` and `/foo/bar`, respectively (i.e. the text I wanted to show, between backticks, to indicate verbatim text).
Librsvg seemed to include verbatim text blocks in gtk-doc delimited like this:
/**
* blah_blah():
*
* For example:
*
* |[
* verbatim text goes here
* ]|
*
* Etc. etc.
*/
Those can go between ``` triple backticks in Markdown:
/**
* blah_blah():
*
* For example:
*
* ```
* verbatim text goes here
* ```
*
* Etc. etc.
*/
Errors I found
My first manual run of gi-docgen looked like this:
$ gi-docgen check Rsvg-2.0.gir
INFO: Loading config file: None
INFO: Search paths: ['/home/federico/src/librsvg/gi-docgen/_build', '/home/federico/.local/share/gir-1.0', '/home/federico/.local/share/flatpak/exports/share/gir-1.0', '/var/lib/flatpak/exports/share/gir-1.0', '/usr/local/share/gir-1.0', '/usr/share/gir-1.0']
INFO: Elapsed time 1.601 seconds
WARNING: Symbol 'Rsvg.HandleFlags' at :0 is not documented
WARNING: Return value for symbol 'Rsvg.Handle.get_dimensions_sub' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.get_geometry_for_element' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.get_geometry_for_layer' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.get_position_sub' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.render_document' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.render_element' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.render_layer' is not documented
WARNING: Return value for symbol 'Rsvg.Handle.set_stylesheet' is not documented
WARNING: Symbol 'Rsvg.Handle.base-uri' at :0 is not documented
WARNING: Symbol 'Rsvg.Handle.dpi-x' at :0 is not documented
WARNING: Symbol 'Rsvg.Handle.dpi-y' at :0 is not documented
WARNING: Symbol 'Rsvg.cleanup' at include/librsvg/rsvg.h:447 is not documented
WARNING: Symbol 'Rsvg.DEPRECATED_FOR' at include/librsvg/rsvg.h:47 is not documented
WARNING: Parameter 'f' of symbol 'Rsvg.DEPRECATED_FOR' is not documented
The warnings like WARNING: Return value ... is not documented are easy
to fix; the comment blocks had their descriptions, but they were
missing the Returns: part.
The warnings like WARNING: Symbol 'Rsvg.Handle.base-uri' at :0 is not documented are different. Those are GObject properties, which previously were documented like this:
/**
* RsvgHandle::base-uri:
*
* Base URI, to be used to resolve relative references for resources. See the section
* "Security and locations of referenced files" for details.
*/
There is a syntax error there! The symbol line should use a single
colon between the class name and the property name,
e.g. RsvgHandle:base-uri instead of RsvgHandle::base-uri. This
one, plus the other properties that showed up as not documented, had
the same kind of typo.
The first warning, WARNING: Symbol 'Rsvg.HandleFlags' at :0
is not documented, turned out to be that there were two
documentation comments with the same title for RsvgHandleFlags, and
the second one was empty — and the last one wins. I left a single one
with the actual docs.
Writing standalone chapters
Librsvg had a few chapters like doc/foo.xml, doc/bar.xml that were included in the reference documentation; each was a stand-alone DocBook document. I was able to convert them to Markdown with
pandoc individually, and then add a Title: heading in the first
line of each .md file — that's what gi-docgen uses to build the
table of contents in the documentation's starting page.
Title: Overview of Librsvg
# Overview of Librsvg
Librsvg is a library for rendering Scalable Vector Graphics files (SVG).
Blah blah blah blah.
Build scripts
There are plenty of examples for using gi-docgen with meson; you can
look at how it is done in gtk.
However, librsvg is still using Autotools! You can steal the following bits:
configure.ac
doc/Makefile.am
Publishing the documentation
Gtk-doc assumed that magic happened somewhere in
developer.gnome.org to generate the documentation and publish it.
Gi-docgen assumes that your project publishes it with Gitlab pages.
Indeed, the new documentation is published there — you can see how
it is generated in .gitlab-ci.yml. Note that there are
two jobs: the reference job generates gi-docgen's HTML in a
public/Rsvg-2.0 directory, and the pages job integrates it with
the Rust API documentation and publishes both together.
Linking the docs to the main developer's site
Finally, librsvg's docs are linked from the GNOME Platform
Introduction. I submitted a merge request to the
developer-www project to update it.
That's all! I hope this is useful for someone who wants to move from
gtk-doc to gi-docgen, which is a much more pleasant tool!
|
|
Posted
almost 4 years
ago
We’re happy to announce that Linux App Summit will take place in Rovereto, Italy, on the 29th and 30th of April.
Linux App Summit (LAS) is a conference focused on building a Linux application ecosystem. LAS aims to encourage the creation of quality applications, seek opportunities for compensation for FOSS developers, and foster a thriving market for the Linux operating system.
This year LAS will be held as a hybrid event and attendees will be able to join virtually or in person at our venue in Rovereto.
Everyone is invited to attend! Companies, journalists, and individuals who are interested in learning more about the Linux desktop application space and growing their user base are especially welcome.
The call for papers and registration will be open soon. Please check linuxappsummit.org for more updates in the upcoming weeks.
About Rovereto
Rovereto, Italy, is an old fortress town located in the autonomous province of Trento in Northern Italy, near the southern edge of the Italian Alps, and is the main city of the Vallagarina district.
The city has several interesting sites including:
The Ancient War Museum
A castle built by the counts of Castelbarco
The Museum of Modern and Contemporary Art of Trento
Rovereto's biggest businesses include wine, coffee, rubber, and chocolate. The town was acknowledged as a “Peace town” in the 20th century. Dinosaur footprints have also been found in the area.
We hope to see you in Rovereto, Italy.
*The image “Rovereto” by barnyz is licensed under CC BY-NC-ND 2.0.
About the Linux App Summit
The Linux App Summit (LAS) brings the global Linux community together to learn, collaborate, and help grow the Linux application ecosystem. Through talks, panels, and Q&A sessions, we encourage attendees to share ideas, make connections, and join our goal of building a common app ecosystem.
Previous iterations of the Linux App Summit have been held in the United States in Portland, Oregon and Denver, Colorado, as well as in Barcelona, Spain.
Learn more by visiting linuxappsummit.org.
|
|
Posted
almost 4 years
ago
In the beginning of December I read Andrei Ciobanu’s Writing a simple 16-bit VM in less than 125 lines of C. Now, I’ve been interested in virtual machines and emulators for a long time, and I work tangential to VMs as part of my day job as a JavaScript engine developer for Igalia. I found this post really interesting because, well, it does what it says on the tin: a simple VM in less than 125 lines of C.
Readers of this blog, if I have any left at this point, might remember that in December 2020 I did a series of posts solving that year’s Advent of Code puzzles in order to try to teach myself how to write programs in the Rust programming language. I did say last year that if I were to do these puzzles again, I would do them at a slower pace and wouldn’t blog about them. Indeed, I started again this year, but it just wasn’t as interesting, having already gone through the progression from easy to hard puzzles and now having some idea already of the kinds of twists that they like to do in between the first and second halves of each puzzle.
So, instead of puzzles, I decided to see if I could write a similarly simple VM in Rust this December, as a continuation of my Rust learning exercise last year1.
Andrei Ciobanu’s article, as well as some other articles he cites, write VMs that simulate the LC-3 architecture2. What I liked about this one is that it was really concise and no-nonsense, and did really well at illustrating how a VM works. There are already plenty of other blog posts and GitHub repositories that implement an LC-3 VM in Rust, and I can’t say I didn’t incorporate any ideas from them, but I found many of them to be a bit verbose. I wanted to see if I could create something in the same concise spirit as Andrei’s, but still idiomatic Rust.
Over the course of a couple of weeks during my end-of-year break from work, I think I succeeded somewhat at that, and so I’m going to write a few blog posts about it.
About the virtual machine
This post is not a tutorial about virtual machines. There are already plenty of those, and Andrei’s article is already a great one, so it doesn’t make sense for me to duplicate it. Instead, in this section I’ll note some things about the LC-3 architecture before we start.
First of all, it has a very spartan instruction set. Each instruction is 16 bits, and there are no variable length instructions. The opcode is packed in the topmost 4 bits of each word. That means there are at most 16 instructions. And one opcode (1101) is not even used!
Only three instructions are arithmetic-type ones: addition, bitwise AND, and bitwise NOT. If you’re used to x86 assembly language you’ll notice that other operations like subtraction, multiplication, bitwise OR, are missing. We only need these three to do all the other operations in 2’s-complement arithmetic, although it is somewhat tedious! As I started writing some LC-3 assembly language to test the VM, I learned how to implement some other arithmetic operations in terms of ADD, AND, and NOT.3 I’ll go into this in a following post.
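As a quick illustration (my own sketch, not code from the post), subtraction falls straight out of two's complement: a − b = a + (NOT b) + 1, so it only needs the ADD and NOT the LC-3 provides.

```rust
// Subtraction built from the LC-3's NOT and ADD only: a - b = a + !b + 1.
// Illustrative sketch, not code from the post.
fn not(x: u16) -> u16 {
    !x
}

fn add(a: u16, b: u16) -> u16 {
    a.wrapping_add(b) // LC-3 arithmetic wraps, like 2's complement hardware
}

fn sub(a: u16, b: u16) -> u16 {
    add(a, add(not(b), 1))
}

fn main() {
    assert_eq!(sub(10, 3), 7);
    assert_eq!(sub(0, 1), 0xffff); // -1 represented as an unsigned 16-bit word
    println!("ok");
}
```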
The LC-3 does not have a stack. All the operations take place in registers. If you are used to thinking in terms of a stack machine (for example, SpiderMonkey is one), this takes some getting used to.
First steps
I started out by trying to port Andrei’s C code to Rust code in the most straightforward way possible, not worrying about whether it was idiomatic or not.
The first thing I noticed is that whereas in C it’s customary to use a mutable global, such as reserving storage for the VM’s memory and registers at the global level with declarations such as uint16_t mem[UINT16_MAX] = {0};, the Rust compiler makes this very difficult. You can use a mutable static variable, but accessing it needs to be marked as unsafe. In this way, the Rust compiler nudges you to encapsulate the storage inside a class:
struct VM {
mem: [u16; MEM_SIZE],
reg: [u16; NREGS],
running: bool,
}
Next we write functions to access the memory. In the C code these are:
static inline uint16_t mr(uint16_t address) { return mem[address]; }
static inline void mw(uint16_t address, uint16_t val) { mem[address] = val; }
In Rust, we have to cast the address to a usize, since usize is the type that we index arrays with:
#[inline]
fn ld(&mut self, addr: u16) -> u16 {
self.mem[addr as usize]
}
#[inline]
fn st(&mut self, addr: u16, val: u16) {
self.mem[addr as usize] = val;
}
(I decide to name them ld and st for “load” and “store” instead of mr and mw, because the next thing I do is write similar functions for reading and writing the VM’s registers, which I’ll call r and rw for “register” and “register write”. These names look less similar, so I find that makes the code more readable yet still really concise.)
The next thing in the C code is a bunch of macros that do bit-manipulation operations to unpack the instructions. I decide to turn these into #[inline] functions in Rust. For example,
#define OPC(i) ((i)>>12)
#define FIMM(i) ((i>>5)&1)
from the C code, become, in Rust,
#[inline] #[rustfmt::skip] fn opc(i: u16) -> u16 { i >> 12 }
#[inline] #[rustfmt::skip] fn fimm(i: u16) -> bool { (i >> 5) & 1 != 0 }
I put #[rustfmt::skip] because I think it would be nicer if the rustfmt tool would allow you to put super-trivial functions on one line, so that they don’t take up more visual space than they deserve.
You might think that the return type of opc should be an enum. I originally tried making it that way, but Rust doesn’t make it very easy to convert between enums and integers. The num_enum crate provides a way to do this, but I ended up not using it, as you will read below.
We also need a way to load and run programs in LC-3 machine code. I made two methods of VM patterned after the ld_img() and start() functions from the C code.
First I’ll talk about ld_img(). What I really wanted to do is read the bytes of the file directly into the self.mem array, without copying, as the C code does. This is not easy to do in Rust. Whereas in C all pointers are essentially pointers to byte arrays in the end, this is not the case in Rust. It’s surprisingly difficult to express that I want to read u16s into an array of u16s! I finally found a concise solution, using both the byteorder and bytemuck crates. For this to work, you have to import the byteorder::ReadBytesExt trait into scope.
pub fn ld_img(&mut self, fname: &str, offset: u16) -> io::Result<()> {
let mut file = fs::File::open(fname)?;
let nwords = file.metadata()?.len() as usize / 2;
let start = (PC_START + offset) as usize;
file.read_u16_into::<BigEndian>(bytemuck::cast_slice_mut(
&mut self.mem[start..(start + nwords)],
))
}
What this does is read u16s, minding the correct byte order, into an array of u8. But we have an array of u16 that we want to store them in, not u8. So bytemuck::cast_slice_mut() treats the &mut [u16] slice as a &mut [u8] slice, essentially equivalent to casting it as (uint8_t*) in C. It does seem like this ought to be part of the Rust standard library, but the only similar facility is std::mem::transmute(), which does the same thing but also much more powerful things, and is therefore marked unsafe. (I’m trying to avoid having any code that needs to be marked unsafe in this project.)
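For intuition, here is a standard-library-only sketch of the byte-order conversion that the byteorder crate performs (illustrative, not the crate's actual implementation): LC-3 images store each word big-endian, so pairs of bytes become u16s.

```rust
// Decode big-endian u16 words out of a raw byte buffer, as an LC-3 image
// loader must. Illustrative stdlib-only sketch.
fn read_u16_be(bytes: &[u8], dst: &mut [u16]) {
    for (i, chunk) in bytes.chunks_exact(2).enumerate() {
        dst[i] = u16::from_be_bytes([chunk[0], chunk[1]]);
    }
}

fn main() {
    // Two big-endian words as they would appear in the image file:
    let file_bytes = [0xf0, 0x26, 0x12, 0x20];
    let mut mem = [0u16; 2];
    read_u16_be(&file_bytes, &mut mem);
    assert_eq!(mem, [0xf026, 0x1220]);
    println!("ok");
}
```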
For running the loaded machine code, I wrote this method:
pub fn start(&mut self, offset: u16) {
self.running = true;
self.rw(RPC, PC_START + offset);
while self.running {
let i = self.ld(self.r(RPC));
self.rw(RPC, self.r(RPC) + 1);
self.exec(i);
}
}
I’ll talk more about what happens in self.exec() in the next section.
The basic execute loop
In the C code, Andrei cleverly builds a table of function pointers and indexes it with the opcode, in order to execute each instruction:
typedef void (*op_ex_f)(uint16_t instruction);
op_ex_f op_ex[NOPS] = {
br, add, ld, st, jsr, and, ldr, str, rti, not, ldi, sti, jmp, res, lea, trap
};
// ...in main loop:
op_ex[OPC(i)](i);
Each function, such as add(), takes the instruction as a parameter, decodes it, and mutates the global state of the VM. In the main loop, at the point where I have self.exec(i) in my code, we have op_ex[OPC(i)](i) which decodes the opcode out of the instruction, indexes the table, and calls the function with the instruction as a parameter. A similar technique is used to execute the trap routines.
This approach of storing function pointers in an array and indexing it by opcode or trap vector is great in C, but is slightly cumbersome in Rust. You would have to do something like this in order to be equivalent to the C code:
type OpExF = fn(&mut VM, u16) -> ();
// in VM:
const OP_EX: [OpExF; NOPS] = [
VM::br, VM::add, ..., VM::trap,
];
// ...in main loop:
OP_EX[opc(i) as usize](self, i);
Incidentally, this is why I decided above not to use an enum for the opcodes. Not only would you have to create it from a u16 when you unpack it from the instruction, you would also have to convert it to a usize in order to index the opcode table.
In Rust, a match expression is a much more natural fit:
match opc(i) {
BR => {
if (self.r(RCND) & fcnd(i) != 0) {
self.rw(RPC, self.r(RPC) + poff9(i));
}
}
ADD => {
self.rw(dr(i), self.r(sr(i)) +
if fimm(i) {
sextimm(i)
} else {
self.r(sr2(i))
});
self.uf(dr(i));
}
// ...etc.
}
However, there is an even better alternative that makes the main loop much more concise, like the one in the C code! We can use the bitmatch crate to simultaneously match against bit patterns and decode parts out of them.
#[bitmatch]
fn exec(&mut self, i: u16) {
#[bitmatch]
match i {
"0000fffooooooooo" /* BR */ => {
if (self.r(RCND) & f != 0) {
self.rw(RPC, self.r(RPC) + sext(o, 9));
}
}
"0001dddsss0??aaa" /* ADD register */ => {
self.rw(d, self.r(s) + self.r(a));
self.uf(d);
}
"0001dddsss1mmmmm" /* ADD immediate */ => {
self.rw(d, self.r(s) + sext(m, 5));
self.uf(d);
}
// ...etc.
}
}
This actually gets rid of the need for all the bit-manipulation functions that I wrote in the beginning, based on the C macros, such as opc(), fimm(), and poff9(), because bitmatch automatically does all the unpacking. The only bit-manipulation we still need to do is sign-extension when we unpack immediate values and offset values from the instructions, as we do above with sext(o, 9) and sext(m, 5).
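The excerpt doesn't show sext itself; a plausible implementation (my sketch, assuming the value occupies only the low `bits` bits, as bitmatch guarantees) uses a shift up followed by an arithmetic shift back down:

```rust
// Sign-extend the low `bits` bits of `val` to a full 16-bit word.
// Sketch of the helper the post refers to; the real one may differ.
fn sext(val: u16, bits: u32) -> u16 {
    let shift = 16 - bits;
    // Shifting left then arithmetic-shifting right (as i16) copies the
    // value's top bit into all the high bits.
    (((val << shift) as i16) >> shift) as u16
}

fn main() {
    assert_eq!(sext(0b11111, 5), 0xffff); // -1 in a 5-bit immediate
    assert_eq!(sext(0b01111, 5), 15);     // positive values pass through
    println!("ok");
}
```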
I was curious what kind of code the bitmatch macros generate under the hood and whether it’s as performant as writing out all the bit-manipulations by hand. For that, I wrote a test program that matches against the same bit patterns as the main VM loop, but with the statements in the match arms just replaced by constants, in order to avoid cluttering the output:
#[bitmatch]
pub fn test(i: u16) -> u16 {
#[bitmatch]
match i {
"0000fffooooooooo" => 0,
"0001dddsss0??aaa" => 1,
"0001dddsss1mmmmm" => 2,
// ...etc.
}
}
There is a handy utility for viewing expanded macros that you can install with cargo install cargo-expand, and then run with cargo expand --lib test (I put the test function in a dummy lib.rs file.)
Here’s what we get!
pub fn test(i: u16) -> u16 {
match i {
bits if bits & 0b1111000000000000 == 0b0000000000000000 => {
let f = (bits & 0b111000000000) >> 9usize;
let o = (bits & 0b111111111) >> 0usize;
0
}
bits if bits & 0b1111000000100000 == 0b0001000000000000 => {
let a = (bits & 0b111) >> 0usize;
let d = (bits & 0b111000000000) >> 9usize;
let s = (bits & 0b111000000) >> 6usize;
1
}
bits if bits & 0b1111000000100000 == 0b0001000000100000 => {
let d = (bits & 0b111000000000) >> 9usize;
let m = (bits & 0b11111) >> 0usize;
let s = (bits & 0b111000000) >> 6usize;
2
}
// ...etc.
_ => // ...some panicking code
}
}
It’s looking a lot like what I’d written anyway, but with all the bit-manipulation functions inlined. The main disadvantage is that you have to AND the value with a bitmask at each arm of the match expression. But maybe that isn’t such a problem? Let’s look at the generated assembly to see what the computer actually executes. There is another Cargo tool for this, which you can install with cargo install cargo-asm and run with cargo asm --lib lc3::test4. In the result, there are actually only three AND instructions, because there are only three unique bitmasks tested among all the arms of the match expression (0b1111000000000000, 0b1111100000000000, and 0b1111000000100000). So it seems like the compiler is quite able to optimize this into something good.
First test runs
By the time I had implemented all the instructions except for TRAP, at this point I wanted to actually run a program on the LC-3 VM! Andrei has one program directly in his blog post, and another one in his GitHub repository, so those seemed easiest to start with.
Just like in the blog post, I wrote a program (in my examples/ directory so that it could be run with cargo r --example) to output the LC-3 machine code. It looked something like this:
let program: [u16; 7] = [
0xf026, // TRAP 0x26
0x1220, // ADD R1, R0, 0
0xf026, // TRAP 0x26
0x1240, // ADD R1, R1, R0
0x1060, // ADD R0, R1, 0
0xf027, // TRAP 0x27
0xf025, // TRAP 0x25
];
let mut file = fs::File::create(fname)?;
for inst in program {
file.write_u16::<BigEndian>(inst)?;
}
Ok(())
In order for this to work, I still needed to implement some of the TRAP routines. I had left those for last, and at that point my match expression for TRAP instructions looked like "1111????tttttttt" => self.trap(t), and my trap() method looked like this:
fn trap(&mut self, t: u8) {
match t {
_ => self.crash(&format!("Invalid TRAP vector {:#02x}", t)),
}
}
For this program, we can see that three traps need to be implemented: 0x25 (HALT), 0x26 (INU16), and 0x27 (OUTU16). So I was able to add just three arms to my match expression:
0x25 => self.running = false,
0x26 => {
let mut input = String::new();
io::stdin().read_line(&mut input).unwrap_or(0);
self.rw(0, input.trim().parse().unwrap_or(0));
}
0x27 => println!("{}", self.r(0)),
With this, I could run the sample program, type in two numbers, and print out their sum.
The second program sums an array of numbers. In this program, I added a TRAP 0x27 instruction right before the HALT in order to print out the answer, otherwise I couldn’t see if it was working! This also required changing R1 to R0 so that the sum is in R0 when we call the trap routine, and adjusting the offset in the LEA instruction.5
When I tried running this program, it crashed the VM! This is due to the instruction ADD R4, R4, x-1 which decrements R4 by adding -1 to it. R4 is the counter for how many array elements we have left to process, so initially it holds 10, and when we get to that instruction for the first time, we decrement it to 9. But if you look at the implementation of the ADD instruction that I wrote above, we are actually doing an unsigned addition of the lowest 5 bits of the instruction, sign-extended to 16 bits, so we are not literally decrementing it. We are adding 0xffff to 0x000a and expecting it to wrap around to 0x0009 like it does in C. But integer arithmetic doesn’t wrap in Rust!
Unless you specifically tell it to, that is. So we could use u16::wrapping_add() instead of the + operator to do the addition. But I got what I thought was a better idea, to use std::num::Wrapping! I rewrote the definition of VM at the top of the file:
type Word = Wrapping<u16>;
struct VM {
mem: [Word; MEM_SIZE],
reg: [Word; NREGS],
running: bool,
}
This did require adding Wrapping() around some integer literals and adding .0 to unwrap to the bare u16 in some places, but on the whole it made the code more concise and readable. As an added bonus, this way, we are using the type system to express that the LC-3 processor does wrapping unsigned arithmetic. (I do wish that there were a nicer way to express literals of the Wrapping type though.)
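For the record, the wrap-around behaviour looks like this (a standalone snippet of mine, not from the post):

```rust
use std::num::Wrapping;

fn main() {
    let counter = Wrapping(0x000au16);   // 10, the loop counter
    let minus_one = Wrapping(0xffffu16); // -1, sign-extended into a u16
    // `+` on Wrapping<u16> wraps modulo 2^16 instead of panicking
    // in debug builds, matching what the LC-3's ADD needs to do:
    assert_eq!((counter + minus_one).0, 0x0009);
    println!("ok");
}
```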
And with that, the second example program works. It outputs 16, as expected. In a following post I’ll go on to explain some of the other test programs that I wrote.
Other niceties
At this point I decided to do some refactoring to make the Rust code more readable and hopefully more idiomatic as well. Inspired by the type alias for Word, I added several more, one for addresses and one for instructions, as well as a function to convert a word to an address:
type Addr = usize;
type Inst = u16;
type Reg = u16;
type Flag = Word;
type TrapVect = u8;
#[inline] fn adr(w: Word) -> Addr { w.0.into() }
Addr is convenient to alias to usize because that’s the type that we use to index the memory array. And Inst is convenient to alias to u16 because that’s what bitmatch works with.
In fact, using types in this way actually allowed me to catch two bugs in Andrei’s original C code, where the index of the register was used instead of the value contained in the register.
I also added a few convenience methods to VM for manipulating the program counter:
#[inline] fn pc(&self) -> Word { self.r(RPC) }
#[inline] fn jmp(&mut self, pc: Word) { self.rw(RPC, pc); }
#[inline] fn jrel(&mut self, offset: Word) { self.jmp(self.pc() + offset); }
#[inline] fn inc_pc(&mut self) { self.jrel(Wrapping(1)); }
With these, I could write a bunch of things to be more expressive and concise. For example, the start() method now looked like this:
pub fn start(&mut self, offset: u16) {
self.running = true;
self.jmp(PC_START + Wrapping(offset));
while self.running {
let i = self.ld(adr(self.pc())).0;
self.inc_pc();
self.exec(i);
}
}
I also added an iaddr() method to load an address indirectly from another address given as an offset relative to the program counter, to simplify the implementation of the LDI and STI instructions.
While I was at it, I noticed that the uf() (update flags) method always followed a store into a destination register, and I decided to rewrite it as one method dst(), which stores a value into a destination register and updates the flags register based on that value:
#[inline]
fn dst(&mut self, r: Reg, val: Word) {
self.rw(r, val);
self.rw(
RCND,
match val.0 {
0 => FZ,
1..=0x7fff => FP,
0x8000..=0xffff => FN,
},
);
}
At this point, the VM’s main loop looked just about as concise and simple as the original C code did! The original:
static inline void br(uint16_t i) { if (reg[RCND] & FCND(i)) { reg[RPC] += POFF9(i); } }
static inline void add(uint16_t i) { reg[DR(i)] = reg[SR1(i)] + (FIMM(i) ? SEXTIMM(i) : reg[SR2(i)]); uf(DR(i)); }
static inline void ld(uint16_t i) { reg[DR(i)] = mr(reg[RPC] + POFF9(i)); uf(DR(i)); }
static inline void st(uint16_t i) { mw(reg[RPC] + POFF9(i), reg[DR(i)]); }
static inline void jsr(uint16_t i) { reg[R7] = reg[RPC]; reg[RPC] = (FL(i)) ? reg[RPC] + POFF11(i) : reg[BR(i)]; }
static inline void and(uint16_t i) { reg[DR(i)] = reg[SR1(i)] & (FIMM(i) ? SEXTIMM(i) : reg[SR2(i)]); uf(DR(i)); }
static inline void ldr(uint16_t i) { reg[DR(i)] = mr(reg[SR1(i)] + POFF(i)); uf(DR(i)); }
static inline void str(uint16_t i) { mw(reg[SR1(i)] + POFF(i), reg[DR(i)]); }
static inline void res(uint16_t i) {} // unused
static inline void not(uint16_t i) { reg[DR(i)]=~reg[SR1(i)]; uf(DR(i)); }
static inline void ldi(uint16_t i) { reg[DR(i)] = mr(mr(reg[RPC]+POFF9(i))); uf(DR(i)); }
static inline void sti(uint16_t i) { mw(mr(reg[RPC] + POFF9(i)), reg[DR(i)]); }
static inline void jmp(uint16_t i) { reg[RPC] = reg[BR(i)]; }
static inline void rti(uint16_t i) {} // unused
static inline void lea(uint16_t i) { reg[DR(i)] =reg[RPC] + POFF9(i); uf(DR(i)); }
static inline void trap(uint16_t i) { trp_ex[TRP(i)-trp_offset](); }
My implementation:
#[bitmatch]
match i {
"0000fffooooooooo" /* BR */ => if (self.r(RCND).0 & f) != 0 { self.jrel(sext(o, 9)); },
"0001dddsss0??aaa" /* ADD1 */ => self.dst(d, self.r(s) + self.r(a)),
"0001dddsss1mmmmm" /* ADD2 */ => self.dst(d, self.r(s) + sext(m, 5)),
"0010dddooooooooo" /* LD */ => self.dst(d, self.ld(adr(self.pc() + sext(o, 9)))),
"0011sssooooooooo" /* ST */ => self.st(adr(self.pc() + sext(o, 9)), self.r(s)),
"01000??bbb??????" /* JSRR */ => { self.rw(R7, self.pc()); self.jmp(self.r(b)); }
"01001ooooooooooo" /* JSR */ => { self.rw(R7, self.pc()); self.jrel(sext(o, 11)); }
"0101dddsss0??aaa" /* AND1 */ => self.dst(d, self.r(s) & self.r(a)),
"0101dddsss1mmmmm" /* AND2 */ => self.dst(d, self.r(s) & sext(m, 5)),
"0110dddbbboooooo" /* LDR */ => self.dst(d, self.ld(adr(self.r(b) + sext(o, 6)))),
"0111sssbbboooooo" /* STR */ => self.st(adr(self.r(b) + sext(o, 6)), self.r(s)),
"1000????????????" /* n/a */ => self.crash(&format!("Illegal instruction {:#04x}", i)),
"1001dddsss??????" /* NOT */ => self.dst(d, !self.r(s)),
"1010dddooooooooo" /* LDI */ => self.dst(d, self.ld(self.iaddr(o))),
"1011sssooooooooo" /* STI */ => self.st(self.iaddr(o), self.r(s)),
"1100???bbb??????" /* JMP */ => self.jmp(self.r(b)),
"1101????????????" /* RTI */ => self.crash("RTI not available in user mode"),
"1110dddooooooooo" /* LEA */ => self.dst(d, self.pc() + sext(o, 9)),
"1111????tttttttt" /* TRAP */ => self.trap(t as u8),
}
Of course, you could legitimately complain that both are horrible soups of one- and two-letter identifiers.6 But I think in both of them, if you have the abbreviations close to hand (and you do, since the program is so small!) it’s actually easier to follow because everything fits well within one vertical screenful of text. The Rust version has the added bonus of the bitmatch patterns being very visual, and the reader not having to think about bit shifting in their head.
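For readers who haven't seen bitmatch before, here is a rough sketch (my own illustration, not the crate's actual macro expansion) of what a pattern like "0001dddsss1mmmmm" amounts to in plain shifts and masks:

```rust
// Hand-written equivalent of unpacking the ADD2 pattern "0001dddsss1mmmmm":
// the named letter runs become shift-and-mask field extractions.
fn decode_add2(i: u16) -> (u16, u16, u16) {
    let d = (i >> 9) & 0b111; // ddd: destination register
    let s = (i >> 6) & 0b111; // sss: source register
    let m = i & 0b11111;      // mmmmm: 5-bit immediate
    (d, s, m)
}

fn main() {
    // 0001 010 011 1 00101 encodes ADD r2, r3, #5
    assert_eq!(decode_add2(0b0001_010_011_1_00101), (2, 3, 5));
}
```

bitmatch generates essentially this unpacking for you, which is why the one-letter names in the match arms line up so directly with the letters in the pattern string.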
Here’s the key for abbreviations:
i — Instruction
r — Register
d — Destination register (“DR” in the LC-3 specification)
s — Source register (“SR1”)
a — Additional source register (“SR2”)
b — Base register (“BASER”)
m — iMmediate value
o — Offset (6, 9, or 11 bits)
f — Flags
t — Trap vector7
pc — Program Counter
st — STore in memory
ld — LoaD from memory
rw — Register Write
adr — convert machine word to memory ADdRess
jmp — JuMP
dst — Destination register STore and update flags
jrel — Jump RELative to the PC
sext — Sign EXTend
iaddr — load Indirect ADDRess
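As an aside, the sext helper from the key above fits in a few lines. This is one plausible implementation (the shift-up-then-arithmetic-shift-down trick), not necessarily the one in my repo:

```rust
// Sign-extend the low `bits` bits of `v` to a full 16-bit word:
// shift the field up to the top, then arithmetic-shift it back down,
// which replicates the sign bit into the high bits.
fn sext(v: u16, bits: u32) -> u16 {
    let shift = 16 - bits;
    (((v << shift) as i16) >> shift) as u16
}

fn main() {
    assert_eq!(sext(0b11111, 5), 0xFFFF); // -1 in 5 bits stays -1 in 16 bits
    assert_eq!(sext(0b01111, 5), 15);     // positive values pass through
}
```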
At this point, the only thing left to do to bring the program to the same level of functionality as the one in Andrei’s blog post was to implement the rest of the trap routines.
Two of the trap routines involve waiting for a key press. This is actually surprisingly difficult in Rust, as far as I can tell, and definitely not as straightforward as the reg[R0] = getchar(); which you can do in C. You can use the libc crate, but libc::getchar() is marked unsafe.8 Instead, I ended up pulling in another dependency, the console crate, and adding a term: console::Term member to VM. With that, I could implement a getc() method that reads a character, stores its ASCII code in the R0 register, and returns the character itself:
fn getc(&mut self) -> char {
let ch = self.term.read_char().unwrap_or('\0');
let res = Wrapping(if ch.is_ascii() { ch as u16 & 0xff } else { 0 });
self.rw(R0, res);
ch
}
This by itself was enough to implement the GETC trap routine (which waits for a key press and stores its ASCII code in the lower 8 bits of R0), and the IN trap routine (which does the same thing but first prints a prompt and echoes the character back to stdout) was not much more complicated:
0x20 => {
self.getc();
}
0x23 => {
print!("> ");
let ch = self.getc();
print!("{}", ch);
}
Next I wrote the OUT trap routine, which prints the lower 8 bits of R0 as an ASCII character. I wrote an ascii() function that converts the lower 8 bits of a machine word into a char:
#[inline]
fn ascii(val: Word) -> char {
char::from_u32(val.0 as u32 & 0xff).unwrap_or('?')
}
// In the TRAP match expression:
0x21 => print!("{}", ascii(self.r(R0))),
Now the two remaining traps were PUTS (print a zero-terminated string starting at the address in R0) and PUTSP (same, but the string is packed two bytes per machine word). These two routines are very similar in that they both access a variable-length area of memory, starting at the address in R0 and ending with the next memory location that contains zero. I found a nice solution that feels very Rust-like to me, a strz_words() method that returns an iterator over exactly this region of memory:
fn strz_words(&self) -> impl Iterator<Item = &Word> + '_ {
self.mem[adr(self.r(R0))..].iter().take_while(|&v| v.0 != 0)
}
The two trap routines differ in what they do with the items coming out of this iterator. For PUTS we convert each machine word to a char with ascii():
0x22 => print!("{}", self.strz_words().map(|&v| ascii(v)).collect::<String>()),
(It’s too bad that we have a borrowed value in the closure, otherwise we could just do map(ascii). On the positive side, collect::<String>() is really nice.)
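As a further aside, Iterator::copied() would let us write map(ascii) after all, assuming Word is (or derives) Copy. A standalone sketch, with Word approximated as a type alias rather than the VM's actual definition:

```rust
use std::num::Wrapping;

// Stand-in for the VM's Word type; assumed here to be a Copy wrapper.
type Word = Wrapping<u16>;

fn ascii(val: Word) -> char {
    char::from_u32(val.0 as u32 & 0xff).unwrap_or('?')
}

fn main() {
    // copied() turns the &Word items into owned Words,
    // so the closure-free map(ascii) type-checks.
    let mem: Vec<Word> = "Hi".bytes().map(|b| Wrapping(b as u16)).collect();
    let s: String = mem.iter().copied().map(ascii).collect();
    assert_eq!(s, "Hi");
}
```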
PUTSP is a bit more complicated. It’s a neat trick to use flat_map() to convert our iterator over machine words into a twice-as-long iterator over bytes. However, we still have to collect the bytes into an intermediate vector so we can check if the last byte is zero, because the string might have an odd number of bytes. In that case we’d still have a final zero byte which we have to pop off the end, because the iterator doesn’t finish until we get a whole memory location that is zero.
0x24 => {
let mut bytes = self
.strz_words()
.flat_map(|&v| v.0.to_ne_bytes())
.collect::<Vec<u8>>();
if bytes.last() == Some(&0) {
bytes.pop();
}
print!("{}", String::from_utf8_lossy(&bytes));
}
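To see where that trailing zero byte comes from in the odd-length case, here is a hypothetical helper (not part of the VM) that packs a string into the PUTSP layout: two bytes per word, low byte first, terminated by a zero word:

```rust
// Pack an ASCII string into PUTSP's two-bytes-per-word layout,
// low byte in the low half, terminated by a full zero word.
fn pack_putsp(s: &str) -> Vec<u16> {
    let mut words: Vec<u16> = s
        .as_bytes()
        .chunks(2)
        .map(|c| c[0] as u16 | ((*c.get(1).unwrap_or(&0) as u16) << 8))
        .collect();
    words.push(0); // terminating zero word
    words
}

fn main() {
    // "abc" has odd length: the last data word 0x0063 carries a zero high byte.
    assert_eq!(pack_putsp("abc"), vec![0x6261, 0x0063, 0x0000]);
}
```

With "abc", the zero high byte of 0x0063 is exactly the byte that the pop() above removes (on a little-endian host, where to_ne_bytes yields low byte first), while the separate 0x0000 word is what stops the take_while iterator.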
Conclusion
At this point, what I had was mostly equivalent to what you have if you follow along with Andrei’s blog post until the end, so I’ll end this post here. Unlike the C version, it is not under 125 lines long, but it does clock in at just under 200 lines.9
After I had gotten this far, I spent some time improving what I had, making the VM a bit fancier and writing some tools to use with it. I intend to make this article into a series, and I’ll cover these improvements in following posts, starting with an assembler.
You can find the code in a GitHub repo. I have kept this first version apart, in an examples/first.rs file, but you can browse the other files in that repository if you want a sneak preview of some of the other things I’ll write about.
Many thanks to Federico Mena Quintero who gave some feedback on a draft of this post.
[1] In the intervening time, I didn’t write any Rust code at all
[2] “LC” stands for “Little Computer”. It’s a computer architecture described in a popular textbook, and as such shows up in many university courses on low-level programming. I hadn’t heard of it before, but after all, I didn’t study computer science in university
[3] I won’t lie, mostly by googling terms like “lc3 division” on the lazyweb
[4] This tool is actually really bad at finding functions unless they’re in particular locations that it expects, such as lib.rs, which is the real reason why I stuck the test function in a dummy lib.rs file
[5] Note that the offset is calculated relative to the following instruction. I assume this is so that branching to an offset of 0 takes you to the next instruction instead of looping back to the current one
[6] Especially since bitmatch makes you use one-letter variable names when unpacking
[7] i.e., memory address of the trap routine. I have no idea why the LC-3 specification calls it a vector, since it is in fact a scalar
[8] I’m not sure why. Federico tells me it is possibly because it doesn’t acquire a lock on stdin.
[9] Not counting comments, and provided we cheat a bit by sticking a few trivial functions onto one line with #[rustfmt::skip]
|
|
Posted
almost 4 years
ago
by
[email protected] (Jakub Steiner)
It’s been exactly one year since I’ve done the foolish thing and changed my blog backend to write more. And to my own surprise it worked. Let me look back at 2021 from a rather narrow perspective of what I usually write about. Perhaps to your disappointment most of it is personal, not professional.
I’ve produced a fraction of my drone videos from the past years in 2021 and haven’t practiced or raced nearly at all this year. This void has been fully filled by music and synthesizers. After two decades of hiatus I enjoy making music again. Fully aware how crude and awful I am at it, there isn’t any other medium where I enjoy my own creations as much as music.
I’ve also come back to pixel art, even though the joy is rather tainted by the tools I use. Very convenient, very direct, so much fun, very proprietary.
In 2022 I’d like to
Replace my reliance on iPad and Apple Pencil. Would be nice to use a small screen tablet on my Fedora instead. Just plug it in when I need it, and run GIMP or Aseprite in the same amount of time it takes me with Procreate and Pixaki.
Embrace Fedora for music making. While I’m not a heavy Ableton Live user, I should totally embrace Bitwig instead as it’s conveniently available as a Flatpak. The Pipewire revolution also made Renoise usable for me again, so maybe I’ll give it another stab.
Continue using the gear I have and not buy any more. I have way more gear than I need. I’m going to sell some I don’t actually enjoy using anymore, but even splitting time between the Digitone, Digitakt, Polyend Tracker and Dirtywave M8 is making me feel unfocused. If I only had a synth room where I could just walk in and jam :)
Continue posting on this ancient platform called WWW. Before I figure out a replacement for comments, feel free to tweet at me or toot.
A little late with wishing you a better 2022 than 2020 was! I didn’t even catch 2021 fly by.
|
|
Posted
almost 4 years
ago
This release adds support for scikit-learn 1.0, which includes support for feature names. If you pass a pandas dataframe to fit, the estimator will set a feature_names_in_ attribute containing the feature names. When a dataframe is passed to predict, it is checked that the column names are consistent with those passed to fit.
The
example below
illustrates this feature.
For a full list of changes in scikit-survival 0.17.0, please see the
release notes.
Installation
Pre-built conda packages are available for Linux, macOS, and Windows via
conda install -c sebp scikit-survival
Alternatively, scikit-survival can be installed from source following
these instructions.
Feature Names Support
Prior to scikit-survival 0.17, you could pass a pandas dataframe to estimators’ fit
and predict methods, but the estimator was oblivious to the
feature names accessible via the dataframe’s columns attribute.
With scikit-survival 0.17, and thanks to scikit-learn 1.0,
feature names will be considered when a dataframe is passed.
Let’s illustrate feature names support using the Veteran’s Lung Cancer dataset.
from sksurv.datasets import load_veterans_lung_cancer
X, y = load_veterans_lung_cancer()
X.head(3)
   Age_in_years  Celltype  Karnofsky_score  Months_from_Diagnosis  Prior_therapy  Treatment
0          69.0  squamous             60.0                    7.0             no   standard
1          64.0  squamous             70.0                    5.0            yes   standard
2          38.0  squamous             60.0                    3.0             no   standard
The original data has 6 features, three of which contain strings, which
we encode as numeric using OneHotEncoder.
from sksurv.preprocessing import OneHotEncoder
transform = OneHotEncoder()
Xt = transform.fit_transform(X)
Transformers now have a get_feature_names_out() method, which will
return the name of features after the transformation.
transform.get_feature_names_out()
array(['Age_in_years', 'Celltype=large', 'Celltype=smallcell',
'Celltype=squamous', 'Karnofsky_score', 'Months_from_Diagnosis',
'Prior_therapy=yes', 'Treatment=test'], dtype=object)
The transformed data returned by OneHotEncoder is again a dataframe,
which can be used to fit
Cox’s proportional hazards model.
from sksurv.linear_model import CoxPHSurvivalAnalysis
model = CoxPHSurvivalAnalysis().fit(Xt, y)
Since we passed a dataframe, the feature_names_in_ attribute will contain
the names of the dataframe used when calling fit.
model.feature_names_in_
array(['Age_in_years', 'Celltype=large', 'Celltype=smallcell',
'Celltype=squamous', 'Karnofsky_score', 'Months_from_Diagnosis',
'Prior_therapy=yes', 'Treatment=test'], dtype=object)
This is used during prediction to check that the data matches the format
of the training data. For instance, when passing a raw numpy array instead
of a dataframe, a warning will be issued.
pred = model.predict(Xt.values)
UserWarning: X does not have valid feature names, but CoxPHSurvivalAnalysis was fitted with feature names
Moreover, it will also check that the order of columns matches.
import pandas as pd

X_reordered = pd.concat(
(Xt.drop("Age_in_years", axis=1), Xt.loc[:, "Age_in_years"]),
axis=1
)
pred = model.predict(X_reordered)
FutureWarning: The feature names should match those that were passed during fit. Starting version 1.2, an error will be raised.
Feature names must be in the same order as they were in fit.
For more details on feature names support, have a look at the
scikit-learn release highlights.
|