Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - How to use RegexIterator in PHP

I have yet to find a good example of how to use PHP's RegexIterator to recursively traverse a directory.

The end result I want is to specify a directory and find all files in it with some given extensions, for example only html/php extensions. Furthermore, I want to filter out folders such as .Trash-0, .Trash-500, etc.

<?php
$Directory = new RecursiveDirectoryIterator("/var/www/dev/");
$It = new RecursiveIteratorIterator($Directory);
$Regex = new RegexIterator($It, '/^.+\.php$/i', RecursiveRegexIterator::GET_MATCH);

foreach ($Regex as $v) {
    // GET_MATCH yields an array of matches; index 0 is the full path
    echo $v[0] . "<br/>";
}
?>

That is what I have so far, but it results in: Fatal error: Uncaught exception 'UnexpectedValueException' with message 'RecursiveDirectoryIterator::__construct(/media/hdmovies1/.Trash-0)

Any suggestions?



1 Answer


There are a couple of different ways of going about something like this. I'll give two approaches for you to choose from: quick and dirty, versus longer and less dirty (though it's a Friday night, so we're allowed to go a little bit crazy).

1. Quick (and dirty)

This involves just writing a regular expression (could be split into multiple) to use to filter the collection of files in one quick swoop.

(Only the two commented lines are really important to the concept.)

$directory = new RecursiveDirectoryIterator(__DIR__);
$flattened = new RecursiveIteratorIterator($directory);

// Make sure the path does not contain "/.Trash*" folders and ends with a .php or .html file
$files = new RegexIterator($flattened, '#^(?:[A-Z]:)?(?:/(?!\.Trash)[^/]+)+/[^/]+\.(?:php|html)$#Di');

foreach($files as $file) {
    echo $file . PHP_EOL;
}

This approach has a number of issues, though it is quick to implement being just a one-liner (though the regex might be a pain to decipher).
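To make the one-liner easier to decipher, here is a small sketch that runs the same pattern against a handful of path strings. The sample paths are made up for illustration, and the sketch assumes forward-slash separators (the pattern has no branch for backslashes).

```php
<?php
// Same pattern as in the quick approach above.
$pattern = '#^(?:[A-Z]:)?(?:/(?!\.Trash)[^/]+)+/[^/]+\.(?:php|html)$#Di';

// Hypothetical sample paths, chosen to exercise each part of the pattern.
$samples = [
    '/var/www/dev/index.php',         // plain PHP file: accepted
    '/var/www/dev/docs/page.HTML',    // "i" flag makes the extension case-insensitive: accepted
    '/var/www/dev/.Trash-0/old.php',  // a ".Trash*" folder in the path: rejected
    '/var/www/dev/style.css',         // wrong extension: rejected
    'C:/sites/dev/index.php',         // optional drive-letter prefix: accepted
];

$matches = [];
foreach ($samples as $path) {
    $matches[$path] = preg_match($pattern, $path) === 1;
    printf("%-32s %s\n", $path, $matches[$path] ? 'match' : 'no match');
}
```

Checking the pattern against plain strings like this is a cheap way to catch regressions before pointing the iterator at a real directory tree.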

2. Less quick (and less dirty)

A more reusable approach is to create a couple of bespoke filters (using regex, or whatever you like!) to whittle the items in the initial RecursiveDirectoryIterator down to only those that you want. The following is only one example, written quickly just for you, of extending the RecursiveRegexIterator.

We start with a base class whose main job is to keep hold of the regex that we want to filter with; everything else is deferred back to the RecursiveRegexIterator. Note that the class is abstract since it doesn't actually do anything useful: the actual filtering is to be done by the two classes which will extend this one. Also, it may be called FilesystemRegexFilter but there is nothing forcing it (at this level) to filter filesystem-related classes (I'd have chosen a better name, if I weren't quite so sleepy).

abstract class FilesystemRegexFilter extends RecursiveRegexIterator {
    protected $regex;
    public function __construct(RecursiveIterator $it, $regex) {
        $this->regex = $regex;
        parent::__construct($it, $regex);
    }
}

These two classes are very basic filters, acting on the file name and directory name respectively.

class FilenameFilter extends FilesystemRegexFilter {
    // Filter files against the regex; anything that is not a file passes through.
    // The bool return type (PHP 7+) avoids a deprecation notice on PHP 8.1+.
    public function accept(): bool {
        return ( ! $this->isFile() || preg_match($this->regex, $this->getFilename()) === 1);
    }
}

class DirnameFilter extends FilesystemRegexFilter {
    // Filter directories against the regex; anything that is not a directory passes through.
    public function accept(): bool {
        return ( ! $this->isDir() || preg_match($this->regex, $this->getFilename()) === 1);
    }
}

To put those into practice, the following iterates recursively over the contents of the directory in which the script resides (feel free to edit this!), filters out the .Trash folders (by accepting only folder names that match the specially crafted regex, a negative lookahead that rejects names starting with .Trash), and accepts only PHP and HTML files.

$directory = new RecursiveDirectoryIterator(__DIR__);
// Filter out ".Trash*" folders
$filter = new DirnameFilter($directory, '/^(?!\.Trash)/');
// Filter PHP/HTML files
$filter = new FilenameFilter($filter, '/\.(?:php|html)$/');

foreach(new RecursiveIteratorIterator($filter) as $file) {
    echo $file . PHP_EOL;
}
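For comparison, the same two rules can be sketched without custom classes by using PHP's built-in RecursiveCallbackFilterIterator (PHP 5.4+). This is an alternative technique, not what the classes above do internally. The sketch assumes PHP 7+ (for the closure's return type) and a Unix-like system. The temporary directory layout is invented for the demo, and the `i` flag on the extension regex is an addition. It also passes two optional flags that the code above does not use: FilesystemIterator::SKIP_DOTS to drop the "." and ".." entries, and RecursiveIteratorIterator::CATCH_GET_CHILD, which should let iteration continue past a directory that cannot be opened instead of aborting with an exception like the one in the question.

```php
<?php
// Build a throwaway tree so the sketch can run anywhere (all names are made up).
// Note: the tree is left behind in the system temp directory.
$root = sys_get_temp_dir() . '/regexiter_demo_' . uniqid();
mkdir($root . '/pages', 0777, true);
mkdir($root . '/.Trash-0', 0777, true);
touch($root . '/index.php');
touch($root . '/pages/about.html');
touch($root . '/pages/notes.txt');
touch($root . '/.Trash-0/old.php');

// SKIP_DOTS drops the "." and ".." entries from the listing.
$directory = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);

$filter = new RecursiveCallbackFilterIterator(
    $directory,
    function (SplFileInfo $current): bool {
        if ($current->isDir()) {
            // A rejected directory is never descended into.
            return preg_match('/^(?!\.Trash)/', $current->getFilename()) === 1;
        }
        // Everything else must look like a PHP or HTML file.
        return preg_match('/\.(?:php|html)$/i', $current->getFilename()) === 1;
    }
);

// CATCH_GET_CHILD: skip subdirectories whose children cannot be fetched.
$files = new RecursiveIteratorIterator(
    $filter,
    RecursiveIteratorIterator::LEAVES_ONLY,
    RecursiveIteratorIterator::CATCH_GET_CHILD
);

$found = [];
foreach ($files as $file) {
    // Store paths relative to $root to keep the output short.
    $found[] = substr($file->getPathname(), strlen($root) + 1);
}
sort($found);
print_r($found); // index.php and pages/about.html
```

The callback version is shorter, while the class-based filters above are easier to reuse and combine across scripts.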

Of particular note is that since our filters are recursive, we can choose to play around with how to iterate over them. For example, we could easily limit ourselves to only scanning up to 2 levels deep (including the starting folder) by doing:

$files = new RecursiveIteratorIterator($filter);
$files->setMaxDepth(1); // Two levels, the parameter is zero-based.
foreach($files as $file) {
    echo $file . PHP_EOL;
}

It is also super-easy to add yet more filters (by instantiating more of our filtering classes with different regexes; or, by creating new filtering classes) for more specialised filtering needs (e.g. file size, full-path length, etc.).
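As a rough illustration of such an extra filter, here is a sketch of a size-based one. `FilesizeFilter`, its `$maxBytes` parameter, and the temporary files are hypothetical names invented for this example; the sketch assumes PHP 7.1+ and extends RecursiveFilterIterator rather than the regex base class, since no regex is involved.

```php
<?php
// Hypothetical filter (not part of the classes above): keeps files up to a size limit.
class FilesizeFilter extends RecursiveFilterIterator {
    private $maxBytes;

    public function __construct(RecursiveIterator $it, int $maxBytes) {
        parent::__construct($it);
        $this->maxBytes = $maxBytes;
    }

    public function accept(): bool {
        $current = $this->current();
        // Let directories through so that recursion continues.
        return $current->isDir() || $current->getSize() <= $this->maxBytes;
    }

    // Overridden because the constructor takes an extra argument that must
    // be passed along to the filter wrapping each subdirectory.
    public function getChildren(): ?RecursiveFilterIterator {
        return new self($this->getInnerIterator()->getChildren(), $this->maxBytes);
    }
}

// Demo on a throwaway tree (left behind in the system temp directory).
$root = sys_get_temp_dir() . '/filesize_demo_' . uniqid();
mkdir($root . '/sub', 0777, true);
file_put_contents($root . '/small.php', str_repeat('a', 10));
file_put_contents($root . '/sub/big.php', str_repeat('a', 5000));
file_put_contents($root . '/sub/tiny.html', 'x');

$directory = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);
$filter = new FilesizeFilter($directory, 1024); // keep files of at most 1 KiB

$names = [];
foreach (new RecursiveIteratorIterator($filter) as $file) {
    $names[] = $file->getFilename();
}
sort($names);
print_r($names); // small.php and tiny.html
```

Because it accepts any RecursiveIterator, a filter like this could also wrap the `$filter` built from DirnameFilter and FilenameFilter above instead of the raw directory iterator.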

P.S. Hmm this answer babbles a bit; I tried to keep it as concise as possible (even removing vast swathes of super-babble). Apologies if the net result leaves the answer incoherent.

