-
Notifications
You must be signed in to change notification settings - Fork 1.5k
[ty] Speedup project file discovery #19913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,10 @@ | ||
use crate::glob::IncludeExcludeFilter; | ||
use crate::{Db, GlobFilterCheckMode, IOErrorDiagnostic, IOErrorKind, IncludeResult, Project}; | ||
use ruff_db::files::{File, system_path_to_file}; | ||
use ruff_db::files::File; | ||
use ruff_db::system::walk_directory::{ErrorKind, WalkDirectoryBuilder, WalkState}; | ||
use ruff_db::system::{SystemPath, SystemPathBuf}; | ||
use ruff_db::system::{Metadata, SystemPath, SystemPathBuf}; | ||
use ruff_python_ast::PySourceType; | ||
use rustc_hash::{FxBuildHasher, FxHashSet}; | ||
use rustc_hash::FxHashSet; | ||
use std::path::PathBuf; | ||
use thiserror::Error; | ||
|
||
|
@@ -163,10 +163,15 @@ impl<'a> ProjectFilesWalker<'a> { | |
|
||
/// Walks the project paths and collects the paths of all files that | ||
/// are included in the project. | ||
pub(crate) fn walk_paths(self) -> (Vec<SystemPathBuf>, Vec<IOErrorDiagnostic>) { | ||
pub(crate) fn walk_paths( | ||
self, | ||
db: &dyn Db, | ||
) -> (Vec<(SystemPathBuf, Metadata)>, Vec<IOErrorDiagnostic>) { | ||
let paths = std::sync::Mutex::new(Vec::new()); | ||
let diagnostics = std::sync::Mutex::new(Vec::new()); | ||
|
||
let system = db.system(); | ||
|
||
self.walker.run(|| { | ||
Box::new(|entry| { | ||
match entry { | ||
|
@@ -232,8 +237,12 @@ impl<'a> ProjectFilesWalker<'a> { | |
} | ||
} | ||
|
||
let mut paths = paths.lock().unwrap(); | ||
paths.push(entry.into_path()); | ||
// If this returns `Err`, then the file was deleted between now and when the walk callback was called. | ||
// We can ignore this. | ||
if let Ok(metadata) = system.path_metadata(entry.path()) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was wondering why you can't use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I thought about using the walk dir entry directly, but we, unfortunately, also need the permissions and, more importantly, the last modified time |
||
let mut paths = paths.lock().unwrap(); | ||
paths.push((entry.into_path(), metadata)); | ||
} | ||
} | ||
} | ||
Err(error) => match error.kind() { | ||
|
@@ -280,31 +289,24 @@ impl<'a> ProjectFilesWalker<'a> { | |
} | ||
|
||
pub(crate) fn collect_vec(self, db: &dyn Db) -> (Vec<File>, Vec<IOErrorDiagnostic>) { | ||
let (paths, diagnostics) = self.walk_paths(); | ||
let (paths, diagnostics) = self.walk_paths(db); | ||
|
||
( | ||
paths | ||
.into_iter() | ||
.filter_map(move |path| { | ||
// If this returns `None`, then the file was deleted between the `walk_directory` call and now. | ||
// We can ignore this. | ||
system_path_to_file(db, &path).ok() | ||
}) | ||
.map(|(path, metadata)| db.files().system_from_metadata(db, &path, metadata)) | ||
.collect(), | ||
diagnostics, | ||
) | ||
} | ||
|
||
pub(crate) fn collect_set(self, db: &dyn Db) -> (FxHashSet<File>, Vec<IOErrorDiagnostic>) { | ||
let (paths, diagnostics) = self.walk_paths(); | ||
|
||
let mut files = FxHashSet::with_capacity_and_hasher(paths.len(), FxBuildHasher); | ||
let (paths, diagnostics) = self.walk_paths(db); | ||
|
||
for path in paths { | ||
if let Ok(file) = system_path_to_file(db, &path) { | ||
files.insert(file); | ||
} | ||
} | ||
let files: FxHashSet<_> = paths | ||
.into_iter() | ||
.map(|(path, metadata)| db.files().system_from_metadata(db, &path, metadata)) | ||
.collect(); | ||
|
||
(files, diagnostics) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come? If
metadata
lets you cheaply query the file type, can you still return aResult
here?I mention this because it seems like we are otherwise careful not to allow a
File
to be backed by a directory.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we could. I had a versio where I did just that.
I considered it to be fine because
Files::try_system
also doesn't return aResult
but the file directly (even if the path maps to a directory). This where I consider the distinction betweensystem_path_to_file
which makes sure it adds a query dependency on status (so that your query re-runs if the status changes) and protects you from getting aFile
that points to a directory or doesn't exist.Either way, I'm going to revert all of this thanks to @ibraheemdev's suggestion