XmlExtractKit for PHP: Stream large XML, extract only what matters, and get plain PHP arrays.
large XML → selected nodes → plain PHP arrays
composer require sbwerewolf/xml-navigator
For local test and coverage dependencies on a standard PHP 8.4 setup,
see tests/ENVIRONMENT.md.
XmlExtractKit is built for the boring XML jobs that show up in real systems.
Use it for feeds, partner exports, imports, SOAP-ish payloads, marketplace catalogs, ETL pipelines, and other legacy XML integrations.
Open large XML with XMLReader, select matching nodes, receive plain
PHP arrays.
use SbWereWolf\XmlNavigator\Parsing\FastXmlParser;
require_once __DIR__ . '/vendor/autoload.php';
$uri = tempnam(sys_get_temp_dir(), 'xml-extract-kit-');
file_put_contents(
    $uri,
    <<<'XML'
    <?xml version="1.0" encoding="UTF-8"?>
    <catalog generated_at="2026-04-05T10:00:00Z">
      <offer id="1001" available="true">
        <name>Keyboard</name>
        <price currency="USD">49.90</price>
      </offer>
      <service id="s-1">
        <name>Warranty</name>
      </service>
      <offer id="1002" available="false">
        <name>Mouse</name>
        <price currency="USD">19.90</price>
      </offer>
    </catalog>
    XML
);
$reader = XMLReader::open($uri);
foreach (
    FastXmlParser::extractHierarchy(
        $reader,
        static fn(XMLReader $cursor): bool =>
            $cursor->nodeType === XMLReader::ELEMENT
            && $cursor->name === 'offer'
    ) as $offer
) {
    var_export($offer);
    echo PHP_EOL;
}
$reader->close();
unlink($uri);
Output:
array (
  'n' => 'offer',
  'a' =>
  array (
    'id' => '1001',
    'available' => 'true',
  ),
  's' =>
  array (
    0 =>
    array (
      'n' => 'name',
      'v' => 'Keyboard',
    ),
    1 =>
    array (
      'n' => 'price',
      'v' => '49.90',
      'a' =>
      array (
        'currency' => 'USD',
      ),
    ),
  ),
)
array (
  'n' => 'offer',
  'a' =>
  array (
    'id' => '1002',
    'available' => 'false',
  ),
  's' =>
  array (
    0 =>
    array (
      'n' => 'name',
      'v' => 'Mouse',
    ),
    1 =>
    array (
      'n' => 'price',
      'v' => '19.90',
      'a' =>
      array (
        'currency' => 'USD',
      ),
    ),
  ),
)
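Because each extracted element is a plain array with the keys shown above (`n` for the name, `a` for attributes, `s` for child elements, `v` for the value), ordinary array code is enough for downstream work. The following is a minimal sketch under that assumption; `flattenOffer()` is a hypothetical helper written for illustration and is not part of XmlExtractKit.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of XmlExtractKit): turn one extracted
 * element into a flat row. Assumes the 'n' / 'a' / 's' / 'v' keys shown
 * in the output above.
 */
function flattenOffer(array $offer): array
{
    // Start with the element's own attributes, e.g. id and available.
    $row = $offer['a'] ?? [];

    foreach ($offer['s'] ?? [] as $child) {
        // The child's value is stored under the child's name, e.g. 'price'.
        $row[$child['n']] = $child['v'] ?? null;

        // Child attributes get a "<child>_<attribute>" column name.
        foreach ($child['a'] ?? [] as $attribute => $value) {
            $row[$child['n'] . '_' . $attribute] = $value;
        }
    }

    return $row;
}

// Sample input copied from the first element of the output above.
$offer = [
    'n' => 'offer',
    'a' => ['id' => '1001', 'available' => 'true'],
    's' => [
        ['n' => 'name', 'v' => 'Keyboard'],
        ['n' => 'price', 'v' => '49.90', 'a' => ['currency' => 'USD']],
    ],
];

$row = flattenOffer($offer);
var_export($row);
```

A flat row like this is convenient for CSV export or a bulk database insert. Repeated child names would overwrite each other in this sketch, so it only suits elements whose children are unique.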
Use XmlConverter when your project already has its own internal
array contract and you want hierarchy output with your own key names.
use SbWereWolf\XmlNavigator\Conversion\XmlConverter;
require_once __DIR__ . '/vendor/autoload.php';
$converter = new XmlConverter(
    val: 'value',
    attr: 'attributes',
    name: 'name',
    seq: 'children',
);
$hierarchy = $converter->toHierarchyOfElements(
    '<price currency="USD">129.90</price>'
);
var_export($hierarchy);
Output:
array (
  'name' => 'price',
  'value' => '129.90',
  'attributes' =>
  array (
    'currency' => 'USD',
  ),
)
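If some data is already in the short-key shape (`n`, `v`, `a`, `s`) shown in the streaming output earlier, a similar renaming can be applied after the fact with plain PHP. This is only a sketch of the idea: `renameKeys()` and the `$map` layout are hypothetical, and nothing here describes how XmlConverter works internally.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: recursively rename the keys of a hierarchy array.
 * Assumes child elements are stored as a list under the 's' key.
 */
function renameKeys(array $element, array $map): array
{
    $renamed = [];

    foreach ($element as $key => $value) {
        if ($key === 's') {
            // Apply the same renaming to every child element.
            $value = array_map(
                static fn (array $child): array => renameKeys($child, $map),
                $value
            );
        }

        // Keys without a mapping are kept unchanged.
        $renamed[$map[$key] ?? $key] = $value;
    }

    return $renamed;
}

$map = [
    'n' => 'name',
    'v' => 'value',
    'a' => 'attributes',
    's' => 'children',
];

$price = ['n' => 'price', 'v' => '129.90', 'a' => ['currency' => 'USD']];

$result = renameKeys($price, $map);
var_export($result);
```

Configuring the key names up front through XmlConverter avoids this second pass, which is why the converter is the better fit when the target contract is known in advance.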
Use FastXmlParser on top of XMLReader when the file is large and
only some nodes matter.
use SbWereWolf\XmlNavigator\Parsing\FastXmlParser;
require_once __DIR__ . '/vendor/autoload.php';
$uri = tempnam(sys_get_temp_dir(), 'xml-extract-kit-');
file_put_contents(
    $uri,
    <<<'XML'
    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
      <offer id="1001">
        <name>Keyboard</name>
        <price>49.90</price>
      </offer>
      <service id="s-1">
        <name>Warranty</name>
      </service>
      <offer id="1002">
        <name>Mouse</name>
        <price>19.90</price>
      </offer>
    </catalog>
    XML
);
$reader = XMLReader::open($uri);
$offers = FastXmlParser::extractHierarchy(
    $reader,
    static fn(XMLReader $cursor): bool =>
        $cursor->nodeType === XMLReader::ELEMENT
        && $cursor->name === 'offer'
);
foreach ($offers as $offer) {
    var_export($offer);
    echo PHP_EOL;
}
// Release the reader and the temporary file only after iteration is done,
// in case the extractor reads from the cursor lazily.
$reader->close();
unlink($uri);
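The predicate passed above is an ordinary callable that inspects the XMLReader cursor. To show what predicate-driven selection involves at the XMLReader level, here is a minimal sketch that uses only PHP's built-in XMLReader. `selectOuterXml()` is a hypothetical helper, not the library's implementation, and it yields raw XML fragments rather than arrays.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch (not the library's implementation): lazily yield
 * the outer XML of every node accepted by the predicate.
 */
function selectOuterXml(XMLReader $reader, callable $isWanted): Generator
{
    $hasNode = $reader->read();

    while ($hasNode) {
        if ($isWanted($reader)) {
            yield $reader->readOuterXml();
            // Skip the matched subtree; its content was already captured.
            $hasNode = $reader->next();
        } else {
            // Step into the document one node at a time.
            $hasNode = $reader->read();
        }
    }
}

$xml = '<catalog>'
    . '<offer id="1001"><name>Keyboard</name></offer>'
    . '<service id="s-1"><name>Warranty</name></service>'
    . '<offer id="1002"><name>Mouse</name></offer>'
    . '</catalog>';

$reader = new XMLReader();
$reader->XML($xml);

$matches = [];
foreach (
    selectOuterXml(
        $reader,
        static fn (XMLReader $cursor): bool =>
            $cursor->nodeType === XMLReader::ELEMENT
            && $cursor->name === 'offer'
    ) as $fragment
) {
    $matches[] = $fragment;
}

// Close only after the loop: the generator reads from the cursor on demand.
$reader->close();

echo count($matches) . PHP_EOL;
```

Only one matched subtree is materialized at a time, which is what keeps memory use flat on large files. Nested matches inside an already matched element are skipped in this sketch.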
Use FastXmlToArray::convert() when you want a stable normalized
structure, then wrap it with XmlElement for convenient traversal.
use SbWereWolf\XmlNavigator\Conversion\FastXmlToArray;
use SbWereWolf\XmlNavigator\Navigation\XmlElement;
require_once __DIR__ . '/vendor/autoload.php';
$xml = <<<'XML'
<catalog region="eu">
  <offer id="1001" available="true">
    <name>Keyboard</name>
    <tag>office</tag>
    <tag>usb</tag>
  </offer>
</catalog>
XML;
$root = new XmlElement(FastXmlToArray::convert($xml));
$offer = $root->pull('offer')->current();
echo $root->name() . PHP_EOL; // catalog
echo $root->get('region') . PHP_EOL; // eu
echo ($root->hasElement('offer') ? 'yes' : 'no') . PHP_EOL; // yes
echo PHP_EOL;
echo 'offer attributes:' . PHP_EOL;
foreach ($offer->attributes() as $attribute) {
    echo $attribute->name() . '=' . $attribute->value() . PHP_EOL;
}
echo PHP_EOL;
echo 'offer elements with name `tag`:' . PHP_EOL;
$tagValues = array_map(
    static fn (XmlElement $tag): string => $tag->value(),
    $offer->elements('tag')
);
var_export($tagValues);
Output:
catalog
eu
yes
offer attributes:
id=1001
available=true
offer elements with name `tag`:
array (
  0 => 'office',
  1 => 'usb',
)
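When wrapping the data in XmlElement is not needed, a similar lookup can be done directly on a hierarchy array. The sketch below assumes the `n` / `a` / `s` / `v` keys shown in the streaming output earlier. Whether FastXmlToArray::convert() emits exactly these keys is an assumption here, and `childValues()` is a hypothetical helper, not a library function.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: collect the values of all direct children
 * with the given name. Assumes children live under the 's' key.
 */
function childValues(array $element, string $name): array
{
    $values = [];

    foreach ($element['s'] ?? [] as $child) {
        if (($child['n'] ?? null) === $name) {
            // A child without a text value contributes null.
            $values[] = $child['v'] ?? null;
        }
    }

    return $values;
}

// Hand-written sample in the assumed shape, mirroring the XML above.
$offer = [
    'n' => 'offer',
    'a' => ['id' => '1001', 'available' => 'true'],
    's' => [
        ['n' => 'name', 'v' => 'Keyboard'],
        ['n' => 'tag', 'v' => 'office'],
        ['n' => 'tag', 'v' => 'usb'],
    ],
];

$tags = childValues($offer, 'tag');
var_export($tags);
```

XmlElement remains the more convenient option once traversal goes beyond one level or needs attribute access, since the helper above only looks at direct children.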
$xmlText or $xmlUri, but not both;
XMLReader, use the streaming API first instead of loading the entire document.
The detailed method-by-method documentation stays available in dedicated files:
Standalone runnable snippets are also included in
examples/.
XmlExtractKit is not trying to be:
The value proposition is much simpler:
stream XML, extract only what matters, and keep working with plain arrays.
| Need | Start here |
|---|---|
| I need plain arrays from XML now | FastXmlToArray::prettyPrint() |
| I need a stable normalized structure for traversal | FastXmlToArray::convert() |
| I need to stream only matching elements from large XML | FastXmlParser::extractPrettyPrint() |
| I need streaming plus normalized output | FastXmlParser::extractHierarchy() |
| I need custom key names | XmlConverter or XmlParser |
| I need low-level composition around an existing cursor | PrettyPrintComposer or HierarchyComposer |
Nicholas Volkhin
e-mail ulfnew@gmail.com
phone +7-902-272-65-35
Telegram @sbwerewolf