Parsing Nested Replies from Users #58
Comments
All content in a reply from user is now parsed as part of that reply-from-user message, even if the reply contains other delimiters, including other replies from the user. The method of doing this relies on the reply-from-user ending delimiter.
=== Additional information supplied by user ===
Subject:
From:
Date:
X-ECN-Queue-Original-Path:
X-ECN-Queue-Original-URL:
reply message content
*** Replied by: username at: dd/mm/yyyy hh:mm:ss ***
reply from ecn message content
=== Additional information supplied by user ===
Subject:
From:
Date:
X-ECN-Queue-Original-Path:
X-ECN-Queue-Original-URL:
reply from message content
===============================================
===============================================
*** Status updated by: username at: dd/mm/yyyy hh:mm:ss ***
status update message
In the example item above, there is a reply from a user nested within another reply from a user. The parent reply-from-user is what will be stored in a dictionary, and all the nested delimiters inside it will be stored as message content of the parent reply-from-user. However, if any of the reply-from-user ending delimiters are discarded, then the status update, which is not nested, and everything after it will be parsed as part of the message content of the parent reply-from-user.
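The behaviour described above can be sketched with a simple depth counter. This is only an illustration, not the actual `ecnqueue` code: the function name `read_reply_from_user`, the `replyFromUser` type name, and the assumption that the ending delimiter is a line made only of `=` characters are all hypothetical, inferred from the example item.

```python
import re
from typing import List, Tuple

# Assumed delimiter shapes, inferred from the example item in this comment.
REPLY_START = "=== Additional information supplied by user ==="
REPLY_END = re.compile(r"^={10,}$")  # a line consisting only of '=' characters


def read_reply_from_user(lines: List[str], start: int) -> Tuple[dict, int]:
    """Collect one reply-from-user section that begins at lines[start].

    Returns the section dictionary and the index of the first line after it.
    """
    depth = 0
    content = []
    index = start
    while index < len(lines):
        line = lines[index].rstrip("\n")
        if line == REPLY_START:
            depth += 1
            if depth > 1:
                content.append(line)  # nested start delimiter is plain content
        elif REPLY_END.match(line):
            depth -= 1
            if depth == 0:
                # The parent's own ending delimiter closes the section.
                return {"type": "replyFromUser", "content": content}, index + 1
            content.append(line)  # nested ending delimiter is plain content
        else:
            content.append(line)
        index += 1
    # Ran out of lines: an ending delimiter was missing, so the section
    # swallowed everything that followed, including non-nested sections.
    return {"type": "replyFromUser", "content": content}, index
```

With both ending delimiters present, the returned index points at the status update line. With one of them removed, the status update ends up inside the parent's `content`, which is the failure mode described above.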
Rather than try to work around parsing improperly merged items, we will revert to parsing chronologically. While parsing chronologically, if we encounter unexpected syntax, we will stop parsing and insert an error section showing the error encountered and the line number. This behavior is similar to how the Python debugger generates error messages and how the Clang C compiler formats its expressive diagnostics.
Example: Properly formatted item
Example item, parsed:
[
{
"type": "initialMessage",
"datetime": "2020-04-23T09:35:47Z",
"userName": "",
"userEmail": "",
"ccRecipients": [ ],
"content": "I need help with my computer.\n"
},
{
"type": "edit",
"datetime": "2020-04-22T16:39:51Z",
"by": "knewell",
"content": "They're computer is arms2106pc12\n"
},
{
"type": "status",
"datetime": "2020-04-23T10:35:47Z",
"by": "knewell",
"content": "Computer is online again\n"
}
]
Example: Improperly formatted item
Example item, parsed:
[
{
"type": "initialMessage",
"datetime": "2020-04-23T09:35:47Z",
"userName": "",
"userEmail": "",
"ccRecipients": [ ],
"content": "I need help with my computer.\n"
},
{
"type": "parseError",
"datetime": "2020-09-30T17:51:10Z",
"content": "Parsing error at 6:35. Expected date string but got '\n'
*** Status updated by: knewell at:
^"
}
]
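A minimal sketch of the chronological approach, assuming section delimiters start with `*** ` and follow the `*** ... by: name at: date ***` shape quoted in this issue. The function name `parse_chronologically`, the `reply` type name, and the regular expressions are assumptions for illustration, and the initial message headers and date normalization are left out for brevity.

```python
import re
from datetime import datetime, timezone

# Assumed delimiter shape, based on the delimiters quoted in this issue.
HEADER_PREFIX = re.compile(r"\*\*\* (Status updated|Replied) by: (\w+) at: ?")
DATE_AND_CLOSE = re.compile(r"(\d{1,2}/\d{1,2}/\d{2,4} \d{1,2}:\d{2}:\d{2}) \*\*\*$")
SECTION_TYPES = {"Status updated": "status", "Replied": "reply"}


def parse_chronologically(text: str) -> list:
    """Parse sections in order and stop at the first unexpected syntax."""
    sections = [{"type": "initialMessage", "content": ""}]
    for line_num, line in enumerate(text.splitlines(), start=1):
        if not line.startswith("*** "):
            # Ordinary text belongs to whichever section is currently open.
            sections[-1]["content"] += line + "\n"
            continue
        prefix = HEADER_PREFIX.match(line)
        date = DATE_AND_CLOSE.match(line, prefix.end()) if prefix else None
        if date is None:
            # Report a 1-based line:column, then stop parsing entirely.
            column = prefix.end() + 1 if prefix else 1
            expected = "date string" if prefix else "section delimiter"
            got = line[column - 1:column] or "\n"
            pointer = " " * (column - 1) + "^"
            sections.append({
                "type": "parseError",
                "datetime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
                "content": f"Parsing error at {line_num}:{column}. "
                           f"Expected {expected} but got {got!r}\n{line}\n{pointer}",
            })
            break
        sections.append({
            "type": SECTION_TYPES[prefix.group(1)],
            "datetime": date.group(1),  # raw date string, not normalized here
            "by": prefix.group(2),
            "content": "",
        })
    return sections
```

Stopping at the first error means nothing after a malformed delimiter is attributed to the wrong section, at the cost of leaving the remainder of the item unparsed.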
Presently, the only dictionary returned after encountering a nested delimiter is formatted similarly to this one, from Item 11 in aae:
{
"type": "parseError",
"datetime": "2020-10-02T10:59:57",
"content": "Nested delimiter encountered on line 131:\n\t *** Replied by: kevin at: 03/09/20 16:43:39 ***\n"
}
While this isn't the best way to store this information, it does successfully identify nested delimiters as well as the line number associated with the error.
def __errorParsing(self, line: str, lineNum: int, lineColumn: int, errorMessage: str) -> dict:
This helper function was implemented in the
{
"type": "parse_error",
"datetime": "2020-10-06T15:38:40-0500",
"content": [
"Encountered Nested delimiter at 128:0",
"*** Status updated by: username at: 4/28/2020 14:21:42 ***"
]
}
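For reference, a helper with that signature could produce the dictionary above roughly as follows. This is a standalone sketch rather than the method from the repository: the name `error_parsing` and its body are assumptions, written only to match the parameters and the output shape shown in this comment.

```python
from datetime import datetime


def error_parsing(line: str, line_num: int, line_column: int, error_message: str) -> dict:
    """Build a parse_error section in the shape shown above (hypothetical)."""
    # Local time with a numeric UTC offset, e.g. 2020-10-06T15:38:40-0500.
    timestamp = datetime.now().astimezone().strftime("%Y-%m-%dT%H:%M:%S%z")
    return {
        "type": "parse_error",
        "datetime": timestamp,
        "content": [
            f"{error_message} at {line_num}:{line_column}",
            line.rstrip("\n"),  # the offending line, without its newline
        ],
    }
```

Calling `error_parsing("*** Status updated by: username at: 4/28/2020 14:21:42 ***", 128, 0, "Encountered Nested delimiter")` yields the same `content` list as the example, with `datetime` set to the time of the call.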
This issue has changed from being about nested replies to a more generic parsing feature. Parsing is already being tracked in another issue, so this will be referenced in #2 and closed.
Item 11 in the aae queue within q-snapshot contains the following reply-from-user section:
The nested section delimiters within the reply from user cause the current ecnqueue script to interpret the sections as separate sections and not as part of the message in the reply from user.