-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add MSBuild binary log (.binlog) component detector #1250
base: main
Are you sure you want to change the base?
Conversation
99ffda7
to
2f8d76a
Compare
2f8d76a
to
135fdd2
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1250 +/- ##
=======================================
+ Coverage 88.9% 89.2% +0.2%
=======================================
Files 359 363 +4
Lines 27672 28663 +991
Branches 1784 1856 +72
=======================================
+ Hits 24613 25569 +956
- Misses 2674 2701 +27
- Partials 385 393 +8 ☔ View full report in Codecov by Sentry. |
src/Microsoft.ComponentDetection.Detectors/nuget/NuGetMSBuildBinaryLogComponentDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/nuget/NuGetMSBuildBinaryLogComponentDetector.cs
Outdated
Show resolved
Hide resolved
if (!topLevelDependencies.TryGetValue(projectEvaluation.ProjectFile, out var topLevel)) | ||
{ | ||
topLevel = new(StringComparer.OrdinalIgnoreCase); | ||
topLevelDependencies[projectEvaluation.ProjectFile] = topLevel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one project file may be evaluated and built several times, you'll often see several evaluations per .csproj. Should we pick the best evaluation somehow or is it fine to do for each evaluation? Restore does an eval, then the build does another eval, and each target framework is a separate eval. We probably want a union of all evaluations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll experiment a bit to see if I can find inaccurate results, but generally we really only care about the item groups that NuGet populates like @(RuntimeCopyLocalItems)
and I think that only happens during the Restore
evaluation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The restore evaluation is going to be pretty useless (it gets packages downloaded but doesn't produce any of the related items). You will definitely need to do all of the inner-build evaluations so that you get the superset of references, e.g.
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFrameworks>net8.0;net472</TargetFrameworks>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Newtonsoft.Json" Version="13.0.3"
Condition=" '$(TargetFramework)' == 'net8.0' " />
<PackageReference Include="System.Text.Json" Version="8.0.4"
Condition=" '$(TargetFramework)' == 'net472' " />
</ItemGroup>
</Project>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed a commit that covers this scenario; check the unit test for some notes, but the end result is that we're scanning the .binlog
correctly and after a dotnet build /bl
we'd pick up and report both Newtonsoft.Json/13.0.3
and System.Text.Json/8.0.4
and ultimately, that's the end goal: map a .csproj
(or rather anything that's not .sln
) to a set of package names and versions that came from it, regardless of the TFM that it came from.
topLevelDependencies[projectEvaluation.ProjectFile] = topLevel; | ||
} | ||
|
||
if (doRemoveOperation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit worried that you might be getting additem/removeitem from different evaluations in an interleaved way, and since they key by the project path this might get confused.
On the other hand it shouldn't happen because the binlog is a tree, and you're walking the tree linearly, so in theory each evaluation should be processed sequentially one after another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What might be an indication that the traversal is bad? If I try to process RemoveItem
but the item isn't already present? Or is that something that MSBuild won't allow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I guess, but as I said, maybe I'm just being paranoid
projectDependencies[packageName] = packageVersion; | ||
} | ||
|
||
project = project.GetNearestParent<Project>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, why are you walking up the project chain? I don't think these items flow to the calling project?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found I needed this to track transitive dependencies. E.g., if library.csproj
has <PackageReference Include="Some.Package" />
and unitTests.csproj
has <ProjectReference Include="..\library\library.csproj" />
this will add Some.Package
first to library.csproj
and then crawl up the chain to unitTests.csproj
so that the dependency is properly reported for both projects. The only oddity with this (and I may have to special case it) is the .sln
file also appears in a Project
node, but I do seem to be getting the proper hierarchy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand this. unitTests.csproj
should have Some.Package
in its assets file and pull assets out of it in its ResolvePackageAssets
. Can you share a log or project setup where this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We explicitly can't use the assets file, because it's not 100% correct, so we have to do it this way. The PR description explains a scenario where the assets file is wrong, but from my manual testing, this will result in a correct reporting of a project and any package that ultimately came from building it. I couldn't think of a scenario where this wasn't the case, but let me know if I missed one, it would make a great test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying that the targets that read the assets file should produce Some.Package
in the transitive case, so walking the project graph here doesn't make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...that read the assets file...
That's the problem, the assets file could be wrong, so I need to crawl it manually. The PR description explains a scenario where the assets file isn't correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assets file will still be a superset of what the project might use. You get the "real usage" by looking at the outcome of package resolution / conflict resolution.
If you walk the project graph you end up trying to replay parts of the build. We shouldn't do that when we can just observe what it did.
Also - the assets file isn't "Incorrect" here - it is correct from NuGet's perspective. It's just that the build has more policy that it applies to decide if it will actually use a package's contents. If that package contributes assets which are "older" than some other contribution it will be dropped. The most common case is framework, as you mentioned -- it's what we designed the feature for in the SDK. Technically it could be any conflict though - when any two packages try to provide the same file the build will compare them to decide who's copy wins.
As an alternative, you could also see how I think your approach is fine too (?) @jeffkl as an expert FYI, in case he wants to take a quick look |
actually we could diff the results of binlogtool listnuget with these results, and see if there are any discrepancies. Would be like an oracle! |
I explicitly don't want the contents of |
I see, makes sense! 👍🏻 |
1ce5859
to
910caa4
Compare
For those interested, I ran the package
This PR:
The difference being that the SDK ( |
if (!topLevelDependencies.TryGetValue(projectEvaluation.ProjectFile, out var topLevel)) | ||
{ | ||
topLevel = new(StringComparer.OrdinalIgnoreCase); | ||
topLevelDependencies[projectEvaluation.ProjectFile] = topLevel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The restore evaluation is going to be pretty useless (it gets packages downloaded but doesn't produce any of the related items). You will definitely need to do all of the inner-build evaluations so that you get the superset of references, e.g.
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFrameworks>net8.0;net472</TargetFrameworks>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Newtonsoft.Json" Version="13.0.3"
Condition=" '$(TargetFramework)' == 'net8.0' " />
<PackageReference Include="System.Text.Json" Version="8.0.4"
Condition=" '$(TargetFramework)' == 'net472' " />
</ItemGroup>
</Project>
src/Microsoft.ComponentDetection.Detectors/nuget/NuGetMSBuildBinaryLogComponentDetector.cs
Outdated
Show resolved
Hide resolved
|
||
if (doRemoveOperation) | ||
{ | ||
topLevel.Remove(packageName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically I think this is going to cause problems if one TF of a project references a thing and the other gets it removed by framework unification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example of this? I'd like to add a test to make sure we don't remove anything that shouldn't be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try this:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFrameworks>net472;net8.0</TargetFrameworks>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="System.Text.Json" Version="6.0.0" />
</ItemGroup>
</Project>
This project DOES use STJ 6.0.0, but ONLY for the net472 TF; it should be removed by conflict resolution against the net8.0 framework in that TF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this is a great example. I've updated it locally to track the add/remove operations per project evaluation ID so the evaluation of the project with net8.0
doesn't remove the add operation from net472
. I'll work on adding a test for this.
projectDependencies[packageName] = packageVersion; | ||
} | ||
|
||
project = project.GetNearestParent<Project>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand this. unitTests.csproj
should have Some.Package
in its assets file and pull assets out of it in its ResolvePackageAssets
. Can you share a log or project setup where this is necessary?
<PackageVersion Include="MSTest.TestFramework" Version="3.5.1" /> | ||
<PackageVersion Include="MSTest.Analyzers" Version="3.5.1" /> | ||
<PackageVersion Include="MSTest.TestAdapter" Version="3.5.1" /> | ||
<PackageVersion Include="Microsoft.Build.Framework" Version="17.5.0" /> | ||
<PackageVersion Include="Microsoft.Build.Locator" Version="1.6.1" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a concern here. Are you running as a standalone .NET 6 application currently? If so, you won't be able to load MSBuild from new SDKs. Have you tried on a machine that has only .NET 8 installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to work in the unit tests which should only have .NET6 installed (it's installing here and if I understand that action correctly, it uses global.json
to determine what to install, so no .NET 8 on the CI machine)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the problem will be when you run the tool on a different machine which has only .NET 8 installed.
This corresponds to a project with multiple TFMs where the same package is imported in each case, but with a different version each time.
3ed303d
to
ffc2ece
Compare
["ResolvedSingleFileHostPack"] = ("NuGetPackageId", "NuGetPackageVersion"), | ||
["ResolvedComHostPack"] = ("NuGetPackageId", "NuGetPackageVersion"), | ||
["ResolvedIjwHostPack"] = ("NuGetPackageId", "NuGetPackageVersion"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it also possible to observe props/targets that NuGet might have imported from the package? For those you'd need to parse out the path, but I do think you could reverse it from any props or targets that are under $(NuGetPackageRoot)
// regular restore operations | ||
["NativeCopyLocalItems"] = ("NuGetPackageId", "NuGetPackageVersion"), | ||
["ResourceCopyLocalItems"] = ("NuGetPackageId", "NuGetPackageVersion"), | ||
["RuntimeCopyLocalItems"] = ("NuGetPackageId", "NuGetPackageVersion"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you read the value of these late enough in the build to understand that they haven't been reduced by ConflictResolution?
https://github.com/dotnet/sdk/blob/c3a8f72c3a5491c693ff8e49e7406136a12c3040/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.ConflictResolution.targets#L51
https://github.com/dotnet/sdk/blob/c3a8f72c3a5491c693ff8e49e7406136a12c3040/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.Sdk.targets#L301-L306
Adds a new NuGet component detector to scan MSBuild binary log (
*.binlog
) files.There was one major reason to add this: the current NuGet component detector that scans
project.assets.json
isn't always 100% accurate. This can happen when a given NuGet package has a transitive dependency on aSystem.*
package, but the installed SDK has a newer version of that file that is replaced at build time.One common example: a project has
<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="6.0.0" />
. That package has a transitive dependency onSystem.Text.Json/6.0.0
. No matter the SDK used to restore and build the project, theproject.assets.json
file will report that the packageSystem.Text.Json/6.0.0
was used and implies that it'll be present at runtime. However, if the6.0.424
SDK or newer was used to build, theSystem.Text.Json.dll
file from the6.0.0
package is removed and replaced with a newerSystem.Text.Json.dll
from the SDK's runtime package.To avoid reporting a false positive for a package that was replaced at build time (and therefore won't be present at runtime) the MSBuild binary log scanner was added.
If a binary log is generated with the
/bl
flag, every MSBuild event will be recorded. The new detector in this PR scans each event for the relevantAddItem
andRemoveItem
elements (corresponding to things like<SomeItemGroup Include="..." ... />
and<SomeItemGroup Remove="..." ... />
and uses that to build a 100% accurate dependency set.Most of the content of this PR is to support the tests to make them as reliable as possible through 2 different means:
.binlog
file for each test by running the appropriatedotnet build ...
command. This ensures there are no large binary test assets added to this repo; they're simply generated as needed.Microsoft.NETCore.App.Ref
,Microsoft.AspNetCore.App.Ref
, andMicrosoft.WindowsDesktop.App.Ref
.Future work
To fully prevent reporting false positives, it would be a good idea to merge the binary log detector and the
project.assets.json
detector so that if a binary log is present and covers a given.csproj
, only the binary log detector is used and fall back to the original detector otherwise. Without this work, the binary log detector will properly not reportSystem.Text.Json/6.0.0
, but theproject.assets.json
detector would, so the false positive will still appear.