Hi everyone,
I’ve been working on a script that can read in Apache logs over standard input, and then extract information such as the user agent, the request method, etc. It’s a PHP script and I’ve been thinking about making a super fast Rust version.
Surprisingly, however, I’m finding that PHP is much faster than Rust. As a PHP developer I’m not surprised that PHP is fast per se, but I would expect it to be slower than Rust, because PHP needs to parse, compile and then run the code, whereas in Rust it’s all precompiled, pre-optimized native code.
Can someone shed some light on why my Rust program is so slow? I’ve taken the code and boiled it down to a pair of simple examples. I’m not using the captures in the Rust file, but I will need them in the final script, so is_match would be a big performance improvement but is not an option here.
Here’s how I test the difference. I’m using PHP 7.3 and a log file of 64,000 lines:
    cargo build --release; hyperfine 'target/release/apache-log-parser < test.log' # 185 ms
    hyperfine 'php example.php < test.log' # 84 ms
Rust:
    extern crate regex;

    use std::io;
    use std::io::prelude::*;
    use regex::Regex;

    fn main() -> io::Result<()> {
        // One capture group per field of an Apache combined-format log line.
        let regex = r#"^([^ ]+) ([^ ]+) ([^\[]+) \[([^\]]*)\] "([A-Z]+) ([^"]*)" ([0-9]+) ([0-9]+) "([^"]*)" "([^"]*)"$"#;
        let regex = Regex::new(regex).unwrap();

        for line in std::io::stdin().lock().lines() {
            let line = line.unwrap();
            // The captures are unused here, but the final script will need them.
            if let Some(_) = regex.captures(&line) {
                println!("{}", line);
            }
        }

        Ok(())
    }
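One thing that may be worth ruling out before blaming the regex is the I/O in the loop above. Rust's standard output is line-buffered and println! takes the stdout lock on every call, and lines() allocates a fresh String per line. Whether that accounts for any meaningful part of the 185 ms is an assumption that would need measuring. The sketch below uses only the standard library: copy_matching_lines and looks_like_request are hypothetical names, and looks_like_request is just a stand-in for the regex.captures(&line) check so that the example compiles without the regex crate.

```rust
use std::io::{self, BufRead, BufWriter, Write};

/// Copies every line accepted by `keep` from `input` to `output`
/// and returns the number of lines written.
fn copy_matching_lines<R, W, F>(mut input: R, output: W, keep: F) -> io::Result<usize>
where
    R: BufRead,
    W: Write,
    F: Fn(&str) -> bool,
{
    // Buffer writes so the output is not flushed once per line.
    let mut out = BufWriter::new(output);
    // Reuse one String instead of allocating a new one for every line.
    let mut line = String::new();
    let mut written = 0;

    loop {
        line.clear();
        // read_line returns Ok(0) at end of input.
        if input.read_line(&mut line)? == 0 {
            break;
        }
        // Strip trailing newline and carriage-return characters.
        let trimmed = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
        if keep(trimmed) {
            out.write_all(trimmed.as_bytes())?;
            out.write_all(b"\n")?;
            written += 1;
        }
    }

    out.flush()?;
    Ok(written)
}

// Hypothetical stand-in for `regex.captures(&line).is_some()`.
fn looks_like_request(line: &str) -> bool {
    line.contains("\"GET ") || line.contains("\"POST ")
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    // Lock both handles once for the whole run.
    copy_matching_lines(stdin.lock(), stdout.lock(), looks_like_request)?;
    Ok(())
}
```

Swapping looks_like_request for a closure that calls regex.captures and timing both versions with hyperfine should show how much of the gap is I/O and how much is the regex itself.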
PHP: