What's new in fast-check 3.23.0?
The previous release introduced the unit concept to the string arbitraries, enhancing control over string generation. In version 3.23.0, we’ve extended this feature to even more arbitraries, broadening its applicability. This update also includes several performance optimizations to make your testing faster and more efficient.
Continue reading to explore the detailed updates it brings.
Polyglot string to the next level
Polyglot string was introduced in version 3.22.0 of fast-check, allowing for more diverse string generation. In this release, we've extended the unit option to several additional arbitraries:
anythingjsonobject
With this change, we recommend using the new stringUnit constraint to specify the type of characters you'd like in generated JSONs, objects, and other structures. The new syntax is as follows:
fc.json({ stringUnit: 'grapheme' });
Dichotomic search in mapToConstant
The mapToConstant arbitrary provides an easy-to-use API for generating values based on an index. Our documentation includes an example that shows how to create an arbitrary to generate characters from a to z and numbers from 0 to 9:
fc.mapToConstant(
{ num: 26, build: (v) => String.fromCharCode(v + 0x61) },
{ num: 10, build: (v) => String.fromCharCode(v + 0x30) },
);
// Examples of generated values: "6", "8", "d", "9", "r"…
This arbitrary is actually more central to the library than it might seem at first glance. It’s a core building block backing the string arbitrary and playing a crucial role in the codebase. Given its importance, maximizing its efficiency is essential.
When benchmarking different variations of string with various unit values, we noticed a significant performance disparity: the grapheme unit was significantly slower than the grapheme-ascii unit. We traced this slowdown to a linear search within the mapToConstant implementation, where each call to generate involved looking up the right build function. Initially, our code looked something like this:
function buildForChoiceIndex(choiceIndex: number) {
let idx = -1;
let numSkips = 0;
while (choiceIndex >= numSkips) {
numSkips += entries[++idx].num;
}
return entries[idx].build(choiceIndex - numSkips + entries[idx].num);
}
Unfortunately, this approach has poor performance due to its linear complexity. For small entries arrays (like in our example with just two entries), the impact is minimal. However, for larger entries arrays (as with grapheme), this linear complexity results in significant slowdowns, as each lookup’s average time increases in proportion to the number of entries.
To improve this, we introduced a new start field to each entry (only internally), marking the starting index for each range. Applied to our example, the entries now look like this:
const entries = [
{ start: 0, num: 26, build: (v) => String.fromCharCode(v + 0x61) },
{ start: 26, num: 10, build: (v) => String.fromCharCode(v + 0x30) },
];
With this additional information, we can perform a dichotomic search to locate the correct build function. Instead of iterating through each entry, we can efficiently narrow down the search range, achieving logarithmic complexity. Here’s the updated implementation:
function buildForChoiceIndex(choiceIndex: number) {
let min = 0;
let max = entries.length;
while (max - min > 1) {
const mid = Math.floor((min + max) / 2);
if (choiceIndex < entries[mid].from) {
max = mid;
} else {
min = mid;
}
}
return entries[min].build(choiceIndex - entries[min].start + entries[min].num);
}
This optimization yielded a performance improvement from 1,800 operations per second to 3,700 operations per second for the following code:
fc.assert(fc.property(fc.string({ unit: 'grapheme' }), (s) => true));
While 1,800 operations per second may already sound substantial, the impact becomes more noticeable in real-world applications where multiple strings are generated in combination, such as arrays of strings or objects with string properties. This improvement can make a tangible difference in CI performance.
For more details on this change, take a look at the PR#5386. We also applied a similar optimization to reduce the cost of canShrinkWithoutContext on constant, where we replaced a linear search with direct access, as described in PR#5372.
Cached computations for faster instantiation
Our next set of performance optimizations focused on reducing the cost of instantiating the string arbitrary. In version 3.22.0, this cost was still relatively high — and for good reason.
By default, the string arbitrary in fast-check can generate values such as __proto__ and key, among others. To achieve this, we need to confirm that these specific values are valid outputs of the generator. This requires checking if __proto__ or other reserved values could actually be produced by string({ unit }), where unit might be something like constantFrom('a', 'b', 'c') or even more complex. This validation step is computationally intensive.
To improve efficiency, we realized that for a given unit, the set of acceptable strings remains consistent across calls to string({ unit }). Based on this, we implemented a caching mechanism backed by a WeakMap, with the arbitrary linked to the specified unit used as the cache key.
This change led to a dramatic improvement, boosting instantiation performance from 20,000 instances per second to 260,000 instances per second.
These optimizations are detailed in PR#5387, PR#5388 and PR#5389.
Changelog since 3.22.0
The version 3.23.0 is based on version 3.22.0.
Features
- (PR#5366) Add support for string-
unitonobject/anythingarbitrary - (PR#5367) Add support for string-
unitonjsonarbitrary - (PR#5390) Add back strong unmapping capabilities to
string
Fixes
- (PR#5327) Bug: Resist even more to external poisoning for
string - (PR#5368) Bug: Better support for poisoning on
stringMatching - (PR#5344) CI: Adapt some tests for Node v23
- (PR#5346) CI: Drop usages of
it.concurrentdue to Node 23 failing - (PR#5363) CI: Move to Vitest for
examples/ - (PR#5391) CI: Preview builds using
pkg.pr.new - (PR#5392) CI: Connect custom templates to
pkg.pr.newpreviews - (PR#5394) CI: Install dependencies before building changesets
- (PR#5396) CI: Proper commit name on changelogs
- (PR#5393) Clean: Drop unused
examples/jest.setup.js - (PR#5249) Doc: Release note for fast-check 3.22.0
- (PR#5369) Doc: Typo fix in model-based-testing.md
- (PR#5370) Doc: Add new contributor jamesbvaughan
- (PR#5383) Doc: Properly indent code snippets for the documentation
- (PR#5372) Performance: Faster
canShrinkWithoutContextfor constants - (PR#5386) Performance: Faster generate process for
mapToConstant - (PR#5387) Performance: Faster tokenizer of strings
- (PR#5388) Performance: Faster initialization of
stringwith faster slices - (PR#5389) Performance: Faster initialization of
stringwith pre-cached slices - (PR#5371) Test: Add extra set of tests for
constant*