View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0005654||SUMo||Bug||public||2019-08-17 19:37||2019-09-09 21:14|
|OS||Windows 10 (64-bit)||OS Version||1903|
|Summary||0005654: SUMo breaks in debug mode|
|Description||It is not possible to run SUMo scan action anymore when switching to debug mode. It can be initiated but it will not finish with a result. It does vanish of the Windows task list and the Windows task bar instead after some time of processing.|
|Steps To Reproduce||Start SUMo with setting configured to debug mode,|
then initiate SUMo scan action.
|Additional Information||In order to test if the implemented solution of issue 5594, I switched SUMo settings to debug mode. Applying it leads to announcement that a SUMo restart is needed to take effect. So I did.|
After SUMo finished starting, I looked at the size of its log file. The text editor reports about 8500 records of size. Starting finished after almost 45 minutes.
As I updated for that test to latest version of SUMo, as reported in the above field, it recommends a scan action. So I did.
As these steps (loading, scanning) take quit some time with my SUMo verbose configuration, I take other Windows not related into foreground while SUMo was still working. I expected scanning to take somewhere between one and two hours. I did not get an Windows error code popup windows nor a Windows SUMo debug error code popup window. So I don't know after how much time SUMo disapeared silently of the Windows process list, task list and task bar. SUMo started its scan action in its Windows window.
After SUMo disappeared, I opened the SUMo log file in text editor to see if I may find any reason there for its silent disappearance. But with no success. I saw the state after loading in the log file, and the standard server polling for some time there. But I wasn't able to see the SUMo scan action start even reported in the log file.
What does this symptom imply?
Does SUMo catch an error and abort without writing out its buffers and without reporting the error situation when in debug mode?
Is this another variant of a SUMo crash?
How shall I test your implemented solution of issue 5594 if that solution only applies to debug mode but debug mode results in disappearing SUMo process (group)?
|Tags||No tags attached.|
||When repeating these steps (SUMo start, scan) with SUMo in foregound, the picture changes. It still ends but now with an error screen, see attachement. Confirming this error window closes all three SUMo windows although I couldn't read yet the content of this third screen yet and hence still without screenshot.|
||The observation with SUMo scan in foreground seems to make SUMo almost finish at least one scan action phase unto reaching these error windows doesn't change on the log file, not a single line of scan action found in log file!|
||Is log created with correct content before the scan operations ?|
How may I tell if the content is correct before the scan operations?
It looks correct. I couldn't find any anomalies attracting my attention (beside not reporting about the scan operation at all although performing several phases of that scan action). I don't have a log check program to report missing sections. I didn't have the impression of a section missing (except the logging of the scan phase).
This means that also the server heart beat isn't reported any longer after start of scan action. The loading and the detection phases finished without any abort beside missing logs. The SUMo scan abort happens after about one hour of consolidation like eliminating installers, duplicates and elements without version info.
||OK. Would you like me to revert to a lower number of lines in debug mode ?|
No. Neither you nor I have identified this as the reason for this symptom.
When I'm running into that test situation, SUMo was working already for almost an hour after its last write to log file operation, having written some windows updates to the display and having space for at least 1200 - 1500 records in the log file. It even didn't write its server heart beat.
Did I forget to attach the screenshot of note 3677. If yes, here it is. This error screen even doesn't use the configured language as opposed to other situations out of other operations and functions to report similar error code in the configured language.
||Yes, screenshot is missing. I'd be very interested by the error message (likely in french). Thanks !|
||Yes, it's French. Sounds similar to a former different one in English. But what will help you more is the address.|
||It seemed I forgot to attach this morning. In my previous note this afternoon, I attached it and still can't neither find it not got an error message in attaching. So next trial now.|
||Does Mantis not like the name of the attachement? I could see it in the preview. I could see it in uploading. Still I can't see it after end of upload. Trying again with unchanged name and with copy of changed name. Hopefully one remains uploaded and included.|
||As far as I understand, the number of lines doesn't seem to be the problem. If it helps, you might reduce the new limit at 65000 records. That's a little bit below a power of two if this phenomen has to do with type overrun (or signed/unsigned handling). That number might be still too small to cover complete startup, scanning and checking action. It should be sufficient to include every single such action. So we know how to cope.|
After the crash, are you able to see the number of lines in the log (by directly accessing the file using Windows explorer) ?
Otherwise, I fully concur, 60K lines should be already a quite large amount of debug log (if limit is not on object memory size) :-)
Usually, I can't access the log files via Windows Explorer. Navigating to the logs isn't possible. Don't know why although having configured the explorer to show hidden files and directories too. So to bypass this inconveniance, I executed a search in Windows Explorer to find the log file and kept that explorer Window open. This allowed me to access the log with a text editor when I wanted. This was after startup before the scan action resp. before the check action, and then again after the crash. As already written, when I opened the log file, there were already some server heart beats in the log. Their number depended how quickly I detected the end of startup or if I wasn't at my notebook. Once, the log was complete at 10000 records. Usually, it was at least almost 8500 records, often between 8700 and 8800 records. That's the situation after startup. Then I closed the editor and initiated the scan action. After about 50 minutes, the error message appeared and SUMo crashed when clicking on the error window, sometimes even without clicking on the error window. After the crash, I was still able to open the log file with text editor via the path in Windows Explorer. And as already written, there was not a single record added since the start of the scan action, neither of the first phases of scanning nor of the server heart beat. The size of the log file and the time stamp of the last modification of the log file didn't change unto the crash and also not afterwards. So again as already written repeatedly, I'm able to open the log file after th crash. The number of records is the number of lines. So usually, the number of lines was unchanged often between 8700 and 8800 lines before the crash and after the crash.
And 65000 records is more than 60k lines but less than 64k lines.
So could you already identify why which variable came into wrong value with the help of the error window?
And does this explain the observed behaviour?
Not able to find the exact root cause with the error message & crash address.
I'll set the limit to 60K and we'll see if it fixes the range issue.
||Problem still present in 5.9.7 as reported by @wolf|
I still consider the related issue 5594 as unresolved.
Independant if you consider the implementation chosen by you for issue 5594, it's not (directly) possible to test if even that implementation does what it claims. That was the reason for this issue 5654.
When trying to test the implementation for issue 5594 indirectly, I made some progress. When starting SUMo with my configuration in verbose mode, after the end of loading, the log file has a size of almost 8400 records. When ignoring SUMos demand for a scan action (due to update of SUMo version), I was able to perform a SUMo check action. The resulting log file size reached almost 25000 records. So this prooves that there exist configurations of SUMo for which the implementation of issue 5594 does what it claims.
When then initiating a SUMo scan action, the symptoms haven't changed. Logging does stop as soon as the scan action starts. This includes the SUMo server heart beat as well as the information of start of SUMo scan action. The first few phases of the SUMo scan action are continued to display in SUMo GUI. After about an hour or progress SUMo vanishes of the operating system process table without announcing a process dump, error message or something else. So the symptoms don't change if starting with 8400 records logged before the SUMo action or with 25000 records logged.
This lets me assume that the record size isn't the problem. The problem seems to be with logging or with SUMo scan action or a combination of both.
For me that sounds that sounds to need progress on issue 5594 to narrow done the area causing the problem and then consider if a debug beta version is needed to provide not only verbose mode but also different debug mode for further analysis. Can you reproduce the error situation or am I so far the only user with this symptom?
You declared issue 5594 as solved. So please explain me how to configure SUMo verbose logging so that it traces only the high level of SUMo start up, SUMo loading, SUMo checking and SUMo scanning in order to identify in which phase of SUMo scanning the symptoms of this issue 5654 arise, and if the preceeding phases were terminated successfully or in undetermined situation. With high level tracing, I mean only to report in log file the entering and leaving of the different phases with corresponding status, not the details on which files it is working. This should shrink log size dramatically. And how do I configure any conveniant log size limit with configurable units (records, kilobytes) for the verbose logging? And does it make sense to configure logging behaviour when logging reaches its configured size limit?
(In non-verbose mode I observe a FIFO mode which I usually consider the most meaningful. What might be an interesting mode is not universal, to keep the first lines of startup, and then apply FIFO or whatever makes sense only on the remainder. In my case, it was the first 32 lines for verbose mode.)
After finishing the identification of the phases of SUMo scan action producing which of the reported symptoms in verbose mode, then I need to know how to change the verbose mode configuration to provide deeper verbose mode reporting for the hopefully identified troublesome SUMo scan action phase, hopefully decoupling the logging symptom of the unintended abort symptom.
You're correct. Fix proposed for 0005594 is only a very partial implementation.
I've cloned with a dedicated number 0005675 for the full feature as requested
Thanks for that feedback. But that's on 5594, not on this issue. You decided to put 5675 on long term fix. So the question remaind how to analyze the logging symptom in order to get logging working also for the SUMo scan action while in debug resp. verbose mode.
Concerning terminology, the SUMo behaviour in the corresponding mode I consider verbose, not debugging. The two can be related or independant. May you really provide debugging options for released versions of SUMo? Or should such debugging options (as opposed to verbose options) be limited to SUMo beta variants resp. versions?
You kept this issue in feedback required state. I couldn't read what feedback you require.
I don't have a problem if you decide to address the logging feature temporarily by another undocumented interface, the signal interface. I also have (dated) developer background. So if sending a defined signal to (which) SUMo process may toggle logging or logging level, I may cooperate.
||I'll switch back to feedback when i'll have more precise idea of what I'd need to progress on this one.|
Did you change anything with version 5.9.8 concerning the verbose behaviour for scan action?
The problem is still present. Can't tell which window and error comes first. Now I see two additional windows as I succeeded again to have only the SUMo windows as well as the Windows task manager open on the Windows desktop while all other applications have either their windows minimized or no window at all (services). So you'll find the new attachement now of the whole desktop for that situation in your email. Mantis is configured not to allow such an attachement size.
Startup of SUMo succeeds. This includes a light scan as you already wrote. So what's the difference to the SUMo scan action, both executed in verbose / debug mode?
||Sorry, I meant "light" by "Normal scan from an empty list", to check if the problem occurs with a less stressing environment for SUMo.|
|2019-08-17 19:37||wolf||New Issue|
|2019-08-18 01:21||wolf||Note Added: 0003677|
|2019-08-18 01:29||wolf||Note Added: 0003678|
|2019-08-20 18:19||Kyle_Katarn||Relationship added||related to 0005594|
|2019-08-20 18:21||Kyle_Katarn||Assigned To||=> Kyle_Katarn|
|2019-08-20 18:21||Kyle_Katarn||Status||new => feedback|
|2019-08-20 18:21||Kyle_Katarn||Note Added: 0003681|
|2019-08-20 19:48||wolf||Note Added: 0003683|
|2019-08-25 16:07||Kyle_Katarn||Note Added: 0003687|
|2019-08-26 08:20||wolf||Note Added: 0003690|
|2019-08-26 14:24||Kyle_Katarn||Note Added: 0003694|
|2019-08-26 19:45||wolf||Note Added: 0003695|
|2019-08-26 19:47||wolf||Note Added: 0003696|
|2019-08-26 19:50||wolf||File Added: sumo.error.20190818.png|
|2019-08-26 19:50||wolf||File Added: sumo.err.20190818.png|
|2019-08-26 19:50||wolf||Note Added: 0003697|
|2019-08-26 19:57||wolf||Note Added: 0003698|
|2019-08-26 22:17||Kyle_Katarn||File Deleted: sumo.err.20190818.png|
|2019-08-26 22:18||Kyle_Katarn||Note Added: 0003699|
|2019-08-26 22:18||Kyle_Katarn||Target Version||=> 5.9.7|
|2019-08-27 02:19||wolf||Note Added: 0003700|
|2019-08-27 09:45||Kyle_Katarn||Note Added: 0003701|
|2019-08-27 09:45||Kyle_Katarn||Status||feedback => acknowledged|
|2019-08-28 09:28||Kyle_Katarn||Status||acknowledged => resolved|
|2019-08-28 09:28||Kyle_Katarn||Resolution||open => fixed|
|2019-08-28 09:28||Kyle_Katarn||Fixed in Version||=> 5.9.7|
|2019-08-29 23:20||Kyle_Katarn||Status||resolved => feedback|
|2019-08-29 23:20||Kyle_Katarn||Resolution||fixed => reopened|
|2019-08-29 23:20||Kyle_Katarn||Note Added: 0003703|
|2019-09-01 18:34||wolf||Note Added: 0003711|
|2019-09-01 19:06||wolf||Issue cloned: 0005674|
|2019-09-01 19:06||wolf||Relationship added||related to 0005674|
|2019-09-01 19:07||Kyle_Katarn||Note Added: 0003712|
|2019-09-01 19:22||wolf||Note Added: 0003714|
|2019-09-01 20:30||Kyle_Katarn||Status||feedback => acknowledged|
|2019-09-01 20:30||Kyle_Katarn||Note Added: 0003717|
|2019-09-01 20:31||Kyle_Katarn||Fixed in Version||5.9.7 =>|
|2019-09-01 20:31||Kyle_Katarn||Target Version||5.9.7 =>|
|2019-09-09 18:19||wolf||Note Added: 0003718|
|2019-09-09 21:14||Kyle_Katarn||Status||acknowledged => feedback|
|2019-09-09 21:14||Kyle_Katarn||Note Added: 0003719|