Makes Dumper (and stringify) faster

Marek Szuba requested to merge experimental/sparseseen_data_dumper into master

Created by: muffato

Use case

Describe the problem. Please provide an example representing the motivation behind the need for having these changes in place.

This is from the Perl documentation:

$Data::Dumper::Sparseseen or $OBJ->Sparseseen([NEWVAL])

By default, Data::Dumper builds up the "seen" hash of scalars that it has encountered during serialization. This is very expensive. This seen hash is necessary to support and even just detect circular references. It is exposed to the user via the "Seen()" call both for writing and reading.

If you, as a user, do not need explicit access to the "seen" hash, then you can set the "Sparseseen" option to allow Data::Dumper to eschew building the "seen" hash for scalars that are known not to possess more than one reference. This speeds up serialization considerably if you use the XS implementation.

Note: If you turn on "Sparseseen", then you must not rely on the content of the seen hash since its contents will be an implementation detail!

TBH, I don't know exactly how much faster it is, but stringify is called so many times that I can't see any reason not to enable this option.
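One way to put a number on the speed-up would be a quick comparison with the core Benchmark module. The following is only a rough sketch: the test data and the dump_with helper are made up for illustration and are not part of this merge request, and the result will depend on the Data::Dumper version and on whether the XS implementation is in use.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Data::Dumper;

# Made-up test data: 500 keys, each pointing at its own 20-element array.
my $data = { map { ( "key$_" => [ 1 .. 20 ] ) } 1 .. 500 };

# Hypothetical helper: dump $data with Sparseseen switched on or off.
sub dump_with {
    my ($sparse) = @_;

    local $Data::Dumper::Indent     = 0;
    local $Data::Dumper::Terse      = 1;
    local $Data::Dumper::Sortkeys   = 1;
    local $Data::Dumper::Sparseseen = $sparse;

    return Dumper($data);
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    default    => sub { dump_with(0) },
    sparseseen => sub { dump_with(1) },
} );
```

Both variants should produce the same string for data like this, since the option only changes the internal bookkeeping.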

Description

Using one or more sentences, describe the proposed changes and how they are implemented.

Like the other Data::Dumper options, I've enabled it locally, i.e. only for the duration of the stringify call rather than globally.
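For readers unfamiliar with the pattern, this is roughly what a locally scoped option looks like. It is a minimal sketch, not the actual diff: stringify_sketch is a hypothetical stand-in, and the real stringify may set a different collection of options.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the project's stringify function.
sub stringify_sketch {
    my ($structure) = @_;

    # "local" restores the previous values when the sub returns, so callers
    # that use Data::Dumper themselves are not affected by these settings.
    local $Data::Dumper::Indent     = 0;    # everything on one line
    local $Data::Dumper::Terse      = 1;    # no "$VAR1 = " prefix
    local $Data::Dumper::Sortkeys   = 1;    # deterministic key order
    local $Data::Dumper::Sparseseen = 1;    # skip most "seen" bookkeeping

    return Dumper($structure);
}

print stringify_sketch( { b => [ 1, 2 ], a => 'x' } ), "\n";
```

Because the assignment is made with local, $Data::Dumper::Sparseseen is back to its previous value as soon as the function returns.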

Possible Drawbacks

If applicable, describe any possible undesirable consequence of the changes.

External code may be calling our stringify while relying on Data::Dumper's "Seen" hash. I think such code could just call Data::Dumper directly instead.
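As an illustration of that alternative, code that needs the "seen" table could create its own Data::Dumper object, which leaves Sparseseen at its default. This is a sketch under assumptions: the variable names and data are invented, and only the documented new, Seen, Indent and Dump methods are used.

```perl
use strict;
use warnings;
use Data::Dumper;

# Invented example data: a structure that points at a reference we already know.
my $shared = [ 1, 2, 3 ];
my $data   = { list => $shared };

my $dumper = Data::Dumper->new( [$data], ['data'] );
$dumper->Seen( { '*shared' => $shared } );    # register the reference under a name
$dumper->Indent(0);

# The registered reference is emitted by name instead of being expanded.
my $out = $dumper->Dump;
print $out, "\n";
```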

Testing

Have you added/modified unit tests to test the changes?

No: stringify is already used in so many places that the existing tests exercise it.

If so, do the tests pass/fail?

N/A

Have you run the entire test suite and no regression was detected?

Yes. All good