[sonar-dev] using python's nosetests and coverage transparently


Stan Hu
I rely on nosetests to generate the coverage reports for all the code in my directory, but Sonar doesn't pick up the coverage for every file. I see many of these messages in the debug output:

DEBUG - Cannot find the file XXX, ignoring coverage measures

 In my sonar-project.properties, I have:

sonar.sources=.
sonar.python.coverage.reportPath=reports/coverage.xml

This appears to work okay, as it automatically picks up the many modules I have in the directory.

I understand from the documentation (http://docs.codehaus.org/display/SONAR/Python+Plugin) and this thread (http://comments.gmane.org/gmane.comp.java.sonar.general/21990) that coverage should be run manually. But I really would prefer not to do this, since I have so many modules to include.

Digging into the source code of sonar-python-plugin, I found that the code doesn't appear to deal with relative filenames in the coverage.xml file. If I generate the XML file with absolute filenames by running 'coverage' in a different directory, Sonar appears to register all the test coverage fine. This appears to be a non-standard way of doing things, since some software such as the Jenkins plug-in does not properly read absolute filenames.

What is happening in Sonar? Let's say I have the following file:

/home/world/foo/bar/hello.py

Let's assume the working directory is /home/world. The relative filename is foo/bar/hello.py. 

In PythonCoverageSensor.saveMeasures(), the org.sonar.api.resources.File.fromIOFile() code tries to determine whether the file belongs to one of the source directories. It does this by comparing the parent of the relative filename against the source directory, then repeating the comparison one level further up the path until it finds a match. It gives up when there is no parent left. For example:

Iteration 1:
cursor = foo/bar/hello.py
canonical parent = /home/world/foo/bar
/home/world == /home/world/foo/bar  -  No

Iteration 2: 
cursor =  foo/bar
canonical parent = /home/world/foo
/home/world == /home/world/foo - No

Iteration 3:
cursor = foo
canonical parent = None

Since there is no more parent, the code terminates and no match is found. sonar-python-plugin tosses out the coverage data since it can't find a common pathname.

Note that this isn't an issue when absolute filenames are used, since the walk can then go up one more level and reach /home/world.
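The walk described above can be reproduced with a small Python sketch. This is not the Sonar source; it only mimics the behavior as described in this message, and the names java_style_parent and find_relative_path are made up for illustration. It relies on one fact about Java: File.getParentFile() returns null for a single-component relative path such as foo, which is what ends the loop early.

```python
from pathlib import PurePosixPath


def java_style_parent(path):
    """Return the parent path, or None where java.io.File.getParentFile() would return null."""
    parent = path.parent
    # pathlib reports "." (or the path itself, for "/") when nothing is left.
    if parent == path or str(parent) == ".":
        return None
    return parent


def find_relative_path(report_path, source_dirs, cwd):
    """Walk up from report_path until the canonical parent equals a source directory.

    Returns the path relative to the matching source directory, or None
    when the walk runs out of parents first.
    """
    dirs = {PurePosixPath(d) for d in source_dirs}
    cursor = PurePosixPath(report_path)
    names = []
    while True:
        parent = java_style_parent(cursor)
        if parent is None:
            return None  # no more parent: give up, coverage is dropped
        # Joining with cwd leaves an absolute parent unchanged.
        canonical_parent = PurePosixPath(cwd) / parent
        names.insert(0, cursor.name)
        if canonical_parent in dirs:
            return "/".join(names)
        cursor = parent


# Relative filename: the walk ends at "foo" before reaching /home/world.
print(find_relative_path("foo/bar/hello.py", ["/home/world"], "/home/world"))
# Absolute filename: one more step is available, so the match is found.
print(find_relative_path("/home/world/foo/bar/hello.py", ["/home/world"], "/home/world"))
```

The first call prints None and the second prints foo/bar/hello.py, matching the three iterations listed above.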

What can be done about this? I can see several fixes:

1) Convert the relative filename loaded by Cobertura import into an absolute filename. Use the current working directory to determine this, or just fix the pattern matching to use the canonical path if the file exists.

2) Use the Cobertura "sources" element at the beginning of the file to supplement the search path. (This requires a patch to coverage.py, which I have already submitted.)

Thoughts?
Re: [sonar-dev] using python's nosetests and coverage transparently

compaqdrew
Hi Stan,

This is a little surreal: I ran into an analogous issue using unittest2 last week. On top of that, I tracked down similar behavior in the (unsupported) ObjC plugin. So following your description is not the first or even the second time I have traced the execution of this function in the last several days. The fact that two engineers have independently been puzzled by this behavior at least three times in a few days is strong evidence to me that something should be done about it in core, but unfortunately I am not the person who makes that call.

The solution I arrived at was (in the unittest2 case) using sed to rewrite the paths in the XML file as absolute paths, and (in the ObjC case) patching the plugin to calculate an absolute path. I am not satisfied with this, but it got the job done and honestly it was not the biggest hassle on my list, so I let it go.
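For anyone who wants the same workaround without sed, here is a rough Python equivalent. It is only a sketch: it uses xml.etree.ElementTree instead of sed, it assumes the report stores paths in the filename attribute of <class> elements (as Cobertura-style reports do), and the function name absolutize_filenames and the base directory argument are made up for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET


def absolutize_filenames(xml_text, base_dir):
    """Return the report with every relative <class filename="..."> made absolute.

    base_dir is the directory the relative names were recorded against,
    typically the directory where coverage was run (an assumption here).
    """
    root = ET.fromstring(xml_text)
    for cls in root.iter("class"):
        filename = cls.get("filename")
        # Leave absolute names alone so the function is safe to run twice.
        if filename and not posixpath.isabs(filename):
            absolute = posixpath.normpath(posixpath.join(base_dir, filename))
            cls.set("filename", absolute)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<coverage><packages><package name="foo.bar"><classes>'
    '<class name="hello" filename="foo/bar/hello.py"/>'
    '</classes></package></packages></coverage>'
)
print(absolutize_filenames(sample, "/home/world"))
```

Note the trade-off mentioned earlier in the thread: a report with absolute paths may no longer be readable by other consumers, so it may be safer to write the result to a separate file that only Sonar reads.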

Drew


Re: [sonar-dev] using python's nosetests and coverage transparently

Waleri Enns
Hi Stan,

On 02/04/2014 09:29 AM, Stan Hu wrote:

> I rely on nosetest to generate the coverage reports for all the code in my
> directory, but Sonar doesn't pick up all the test coverage for everything.
> I see many of these messages on the debug output:
>
> DEBUG - Cannot find the file XXX, ignoring coverage measures
>
>   In my sonar-project.properties, I have:
>
> sonar.sources=.
> sonar.python.coverage.reportPath=reports/coverage.xml
>
> This appears to work okay, as it automatically picks up the many modules I
> have in the directory.
>
> I understand from the documentation (
> http://docs.codehaus.org/display/SONAR/Python+Plugin) and this thread (
> http://comments.gmane.org/gmane.comp.java.sonar.general/21990) that
> coverage should be run manually.

If by 'run manually' you mean "open a shell and type those lines into it",
then you're misunderstanding the docs. They say "Use Ned Batchelder's
coverage package like this", which of course *includes scripting*. I
expect that any reasonable developer puts the generation of the coverage
report in a script. We are automation experts, after all, and should
use our skills to ease our work, too ;)

I understand you want the plugin to do all this work for you (as e.g.
the java plugin does) by making all the necessary calls under the hood.
I do too ;)

The reason the plugin doesn't do that is that it doesn't have the
necessary knowledge of how to make the right calls. It doesn't know where
your tests reside, which PYTHONPATH they expect, etc.
It may not sound like a big deal if one is looking at one
concrete project, but looking at all the projects out there, one discovers
that there is quite a lot of variation in the task "executing the tests
and collecting the coverage information". By declaring the plugin
interface as "just give me correctly shaped coverage information in
Cobertura format", we're putting the responsibility for making those
calls into the (IMO) right component: project-side scripting.

BTW: The java plugin can do this trick on maven projects because the
maven universe is "strongly ordered".

> But I really would prefer not to do this,
> since I have so many modules to include.
>
> Digging into the source code of sonar-python-plugin, I found that the code
> doesn't appear to deal with relative filenames in the coverage.xml file.

Yes, that's correct, because finding the Sonar resource based on a path
and a bunch of source directories is considered a plugin-agnostic
problem and the job of the platform.

> If
> I generate the XML file with absolute filenames by running 'coverage' in a
> different directory, Sonar appears to register all the test coverage fine.
> This appears to be a non-standard way of doing things, since some software
> such as the Jenkins plug-in does not properly read absolute filenames.
>
> What is happening in Sonar? Let's say I have the following file:
>
> /home/world/foo/bar/hello.py
>
> Let's assume the working directory is /home/world. The relative filename is
> foo/bar/hello.py.
>
> In PythonCoverageSensor.saveMeasures(), the
> org.sonar.api.resources.File.fromIOFile() code attempts some pattern
> matching to determine whether the file should be used.  It does this by
> trying to match the parent of the relative filename and progressively
> descending until it finds a match. It gives up if there is no more parent.
> For example:
>
> Iteration 1:
> cursor = foo/bar/hello.py
> canonical parent = /home/world/foo/bar
> /home/world == /home/world/foo/bar  -  No
>
> Iteration 2:
> cursor =  foo/bar
> canonical parent = /home/world/foo
> /home/world == /home/world/foo - No
>
> Iteration 3:
> cursor = foo
> canonical parent = None
>
> Since there is no more parent, the code terminates and no match is found.
> sonar-python-plugin tosses out the coverage data since it can't find a
> common pathname.

The problem here is that your foo/bar/hello.py is too short: the
algorithm used in org.sonar.api.resources.File.fromIOFile() would find a
match if the path were at least one element longer, like
world/foo/bar/hello.py

And you're absolutely right, that's certainly an issue, as for the case of
sonar.sources=. and coverage --sources=. the toolchain simply doesn't
work. I overlooked this, as all my setups are like 'sonar.sources=src'
or something.

>
> Note that this isn't an issue when absolute filenames are used, since the
> pattern matching will descend one more level to find a match.
>
> What can be done about this? I can see several fixes:
>
> 1) Convert the relative filename loaded by Cobertura import into an
> absolute filename. Use the current working directory to determine this,

Feels like this approach is the most practical one.

> or
> just fix the pattern matching to use the canonical path if the file exists.

A lot of language plugins rely on that and it hasn't changed for ages,
AFAIK... Also, it would introduce a bunch of FS lookups into this
algorithm, which would decrease performance. So I'm pessimistic that
SonarSource would change this.

>
> 2) Use the Cobertura "sources" element at the beginning of the file to
> supplement the search path. (This requires a patch to coverage.py, which I
> have already submitted.)

What's the status of that?

>
> Thoughts?
>


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: [sonar-dev] using python's nosetests and coverage transparently

Stan Hu
Waleri,

Thanks for your response. My comments are in-line.

On Wed, Feb 5, 2014 at 1:43 AM, Waleri Enns <[hidden email]> wrote:
> Hi Stan,
>
> If by 'run manually' you mean "open a shell and type those line into it" then youre misunderstanding the docs. It does say "Use Ned Batchelders coverage package like this", which of course *includes scripting*. I expect that any reasonable developer puts the generation of the coverage report in a script. We're are automation experts, after all, and should use our skills to ease our work, too ;)
 
Right. Sorry, by "run manually", I meant enumerating all the source directories as opposed to relying on the defaults (i.e. report everything).

Incidentally, for coverage.py v3.5+, the right parameter is "--source=<python package>" (singular) instead of "--sources". 
 

> I understand you want the plugin to do all this work for you (as e.g. the java plugin does) by making all the necessary calls under the hood. I do too ;)
>
> The reason the plugin doesn't do that is because it doesn't have the necessary knowledge how to make the right calls. It doesn't know where your tests reside, which pythonpath they're expecting etc.
> It maybe doesn't sound like a big deal, if one is looking at one concrete project, but looking at all projects out there, one discovers that there is quite of lot of variation in the task "executing the tests and collecting the coverage information". By declaring the plugin interface as: "just give me correctly shaped coverage information in Cobertura format" we're putting the responsibility for making those calls into the (IMO) right component: project-side scripting.
>
> BTW: The java plugin can do this trick on maven projects because the maven universe is "strongly ordered".
>
> The problem here is that your foo/bar/hello.py is to short, the algorithm used in org.sonar.api.resources.File.fromIOFile() would find a match if it would be at least one element longer, like world/foo/bar/hello.py
>
> And youre absolutely right, thats certainly an issue, as for the case of sonar.sources=. and coverage --sources=. the toolchain simply doesnt work. I overlooked this, as all my setups are like 'sonar.sources=src' or something.

Right.
 

> > What can be done about this? I can see several fixes:
> >
> > 1) Convert the relative filename loaded by Cobertura import into an
> > absolute filename. Use the current working directory to determine this,
>
> Feels like this approach is the most practical one.

I looked at the Cobertura XML generator source code in detail. The <sources> element defines the absolute pathnames where files can be found (i.e. the CLASSPATH), and the filenames are all relative filenames. If there are multiple pathnames specified, then it's possible that the XML parser would have to search across all the paths to find the right file. 

For example, let's say you had the following organization:

Cobertura paths (from <sources> element) = /subdir1, /subdir2
Relative Source files = test.py, test2.py

Now test.py could reside in either /subdir1 or /subdir2. Just as a Java interpreter would search CLASSPATH, the XML parser would need to search both paths to find test.py.

I think the right approach should be:

1) Build a list of candidate directories using the "<sources>" element in the Cobertura XML file if it exists.
2) Include the current working directory in this list.
3) Filter this list to include only the project source directories (specified by sonar.sources).
4) For each element in this list, build a candidate absolute filename from each filename entry in the XML file.
5) If the candidate file exists, include it in the coverage measures.
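To make the five steps concrete, here is a rough Python sketch of the lookup. It is only an illustration of the proposal, not plugin code: the function names are made up, step 3 is interpreted as "keep candidates that are equal to or below a sonar.sources directory", and the existence check is passed in as a parameter so the example can run without touching the disk.

```python
import os
import posixpath
import xml.etree.ElementTree as ET


def is_within(path, root):
    """True if path equals root or lies somewhere below it."""
    root = root.rstrip("/")
    return path == root or path.startswith(root + "/")


def resolve_coverage_files(xml_text, cwd, project_source_dirs, exists=os.path.exists):
    """Map each filename in the report to the first existing absolute candidate."""
    root = ET.fromstring(xml_text)
    # Step 1: candidate directories from the <sources> element, if present.
    candidates = [s.text.strip() for s in root.iter("source") if s.text and s.text.strip()]
    # Step 2: include the current working directory.
    candidates.append(cwd)
    # Step 3: keep only candidates covered by the project source directories.
    allowed = [posixpath.normpath(posixpath.join(cwd, d)) for d in project_source_dirs]
    candidates = [posixpath.normpath(c) for c in candidates]
    candidates = [c for c in dict.fromkeys(candidates)
                  if any(is_within(c, a) for a in allowed)]
    resolved = {}
    for cls in root.iter("class"):
        filename = cls.get("filename")
        if not filename:
            continue
        # Step 4: build a candidate absolute filename per directory.
        for base in candidates:
            path = posixpath.normpath(posixpath.join(base, filename))
            # Step 5: the first candidate that exists wins.
            if exists(path):
                resolved[filename] = path
                break
    return resolved


report = (
    '<coverage><sources><source>/subdir1</source><source>/subdir2</source></sources>'
    '<packages><package name="p"><classes>'
    '<class name="test" filename="test.py"/>'
    '<class name="test2" filename="test2.py"/>'
    '</classes></package></packages></coverage>'
)
on_disk = {"/subdir2/test.py", "/subdir1/test2.py"}  # pretend filesystem
print(resolve_coverage_files(report, "/work", ["/subdir1", "/subdir2"],
                             exists=lambda p: p in on_disk))
```

With the pretend filesystem above, test.py resolves to /subdir2/test.py and test2.py to /subdir1/test2.py, which is the CLASSPATH-style search across both paths described in the example.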



> > or
> > just fix the pattern matching to use the canonical path if the file exists.
>
> A lot of language plugins rely on that and it didnt change for ages, AFAIK... Also, it would introduce a bunch of FS lookups into this algorithm, which would decrease the performance. So Im pessimistic that SonarSource would change this.

As I mentioned earlier, I think introducing filesystem lookups is unavoidable given the structure of the XML file. Just because it's been around for ages doesn't mean it's correct. I noticed that the Sonar Java Cobertura parser also ignores the "<sources>" element.
 

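> > 2) Use the Cobertura "sources" element at the beginning of the file to
> > supplement the search path. (This requires a patch to coverage.py, which I
> > have already submitted.)
>
> Whats the status of that?

There is a pull request out (https://bitbucket.org/ned/coveragepy/pull-request/32/issue-94-include-the-sources-element/diff), but it is still under review.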

 

Reply | Threaded
Open this post in threaded view
|

Re: [sonar-dev] using python's nosetests and coverage transparently

Waleri Enns
On 02/07/2014 09:29 PM, Stan Hu wrote:

> I think the right approach should be:
>
> 1) Build a list of candidate directories using the "<sources>" element
> in the Cobertura XML file if it exists.
> 2) Include the current working directory in this list.
> 3) Filter this list to include only the project source directories
> (specified by sonar.sources).
> 4) For each element in this list, build a candidate absolute filename
> from each filename entry in the XML file.
> 5) If a file exists, include into the coverage measures.

Thanks for the investigation. Can you file an issue, please? I will try
to look at it as soon as my schedule allows.

> > > 2) Use the Cobertura "sources" element at the beginning of the file to
> > > supplement the search path. (This requires a patch to coverage.py, which I
> > > have already submitted.)
> >
> > Whats the status of that?
>
> There is a pull request out
> (https://bitbucket.org/ned/coveragepy/pull-request/32/issue-94-include-the-sources-element/diff),
> but it is still under review.

Re: [sonar-dev] using python's nosetests and coverage transparently

Stan Hu
Waleri,

Thanks! 

I attempted to file an issue under my JIRA username (stanhu), but it appears issues for SonarQube-related projects can only be created by people who have permissions:



On Sat, Feb 8, 2014 at 2:56 AM, Waleri Enns <[hidden email]> wrote:
Thanks for the investigation. Can you file an issue, please? I will try to look at it as soon as my schedule allows.




