DesignBais v3.3 Startup problem

There is an issue with DesignBais v3.3 and mv.NET v2.1.4 where starting DesignBais from the startDB.asp web page consistently fails with "cannot connect"-type messages that don’t point to any specific cause. If you have not seen this then you don’t need the solution presented here, but for some of us the problem seems chronic.

I’ve been told other sites never see this problem and I suspect the most stable servers are those that stay up all the time where DB is never reset. I’m also hoping this may go away with v4, but I sort of doubt it will unless the cause for the problem is diagnosed. Until then I’m presenting here a description of the problem and a temporary fix.

The problem always seems to be resolved when the program MVNET.HOUSEKEEPALL gets run from TCL in an account that is failing to initialize. This program deletes all of the MVNET.TMP_ files created by mv.NET. These files are used to manage connections on different ports. mv.NET includes a housekeeping function which can be executed at timed intervals, but that only runs when we are using pooled mv.NET sessions. DesignBais manages its own connections through a non-pooled mv.NET session, so housekeeping is never run automatically.

You can’t always tell which account is the source of the problem. I have at least 8 accounts with DB and mv.NET installed, for development, testing, demos, etc. When I first encountered this problem I tried the housekeeping option, but it didn’t always work, so I didn’t understand the real nature of the problem. I only use two mv.NET licenses for DB work. I always set one license for a dedicated connection to an account where I’m doing most of my work. I have another license for switching connections. When DB starts it initializes the dedicated connection if it can, logs into DBILOGIN for its own purposes, then logs into the first account that my current user (designbais) is authorized to log into according to the DB Users settings. So (I believe) that limits the scope to three possible accounts where failure can occur.

I experience the problem after I shut down for the night and then restart the next day, or when I stop DB so that I can adjust my dedicated connection allocation and then restart. Strangely, existing sessions don’t exhibit a problem if I’m changing accounts after DB is started; the problem is that I can’t start DB after it’s been shut down.

Rather than going to each account and running housekeeping manually, hoping that I cleared the problem, my solution is to run housekeeping on all accounts, get rid of any old junk and start from scratch. To do this, I wrote a program that checks every account on the system to see if it has MVNET.TMP_ files, and then deletes those files. The original version of this code used a LOGTO to get into each account and then run housekeeping, but that won’t work in a production environment where there are logon procs, application authentication, account passwords, etc. So I decided to use STEAL-FILE to pull the files into the current account, then housekeep the files locally. (I could have just deleted the files; it doesn’t matter.)

To use this code, put it in some primary account with DB and mv.NET enabled. Compile and catalog it as MVNET.HOUSEKEEPALL.ACCOUNTS. Then, right before you use the startDB.asp page, run this program. It will delete the temp files from all accounts so that you’re starting with a clean slate every time. I was going to suggest that this could happen automatically, by overloading the MVNET.START program or putting this into a logon proc that only gets executed when you do your first-time init, but I’d rather not get too cute about this. It’s a temporary fix to an occasional problem, and I don’t think it should be built into the environment. That said, there’s nothing wrong with putting it in your user-coldstart like this:
TO DBI.DEMO
MVNET.HOUSEKEEPALL.ACCOUNTS
Just make sure it’s catalogued in that account of course.

Please let me know if this doesn’t do the intended job or if you have other suggestions. Code is on next page.

5 thoughts on “DesignBais v3.3 Startup problem”

    • I have had a few problems with my D3 virtual machine powering off unexpectedly while running DesignBais and mv.NET, and I was also advised to clear a number of DesignBais files, so I have modified your routine to do this:

      * MVNET.HOUSEKEEPALL.ACCOUNTS
      * Copyright 2006 Nebula Research and Development
      * This code may be freely used and modified without permission
      * but shall not be sold. Please leave these comments intact
      * and report issues to Nebula R&D.
      ** For use with mv.NET over D3
      * Description: Clean up all mv.NET temp files in a D3 system.
      * mv.NET cleans up after itself with a housekeeping process,
      * but only if it’s running pooled sessions and housekeeping
      * is turned on. This wipes all temp files in accounts that
      * may not be pooled.
      ** History:
      * Written: 2006sep02 Tony Gravagno, Nebula R&D
      * Amended: 2006oct10 Phil Short, Lonsvale Limited
      *          This version also clears DesignBais files
      *          (Code not tested by Tony)
      * First, housekeep current account
      WIPE.CMD = \MVNET.HOUSEKEEPALL\:@AM:\YES\
      EXECUTE WIPE.CMD CAPTURING OUT
      * Now go and steal tmp files from all other accounts
      ACCTS.CMD = \SSELECT MDS,, WITH A1 D] Q]\
      * Get a list of accounts
      EXECUTE ACCTS.CMD RTNLIST ACCTS
      EOF.ACCTS = 0
      LOOP
      * Get the next account
         READNEXT ACCT FROM ACCTS ELSE EOF.ACCTS = 1
      UNTIL EOF.ACCTS DO
         FILES.CMD = \SSELECT \:ACCT:\,, = MVNET.TMP_]\
      * Following assumes we have retrieval access to the MD
         EXECUTE FILES.CMD RTNLIST FILES CAPTURING OUT
         IF FILES # "" THEN
            CRT "CLEARING ":ACCT
            CT = DCOUNT(FILES,@AM)
            FOR FNUM = 1 TO CT
               EXECUTE \STEAL-FILE \:FILES<FNUM>:@VM:ACCT CAPTURING OUT
            NEXT FNUM
      * All files from that account are now in this account
            EXECUTE WIPE.CMD CAPTURING OUT
         END
      * Now clear the DesignBais files DBISESSIONS, DBIXMLLOG and DBIAUDIT
         FILES.CMD = \SSELECT \:ACCT:\,, DBISESSIONS DBIXMLLOG DBIAUDIT\
         EXECUTE FILES.CMD RTNLIST FILES CAPTURING OUT
         IF FILES # "" THEN
            CT = DCOUNT(FILES,@AM)
            FOR FNUM = 1 TO CT
               IF TRIM(FILES<FNUM>) # "" THEN
                  CRT ACCT:" ":FILES<FNUM>
                  SF.CMD = \set-file \:ACCT:\ \
                  SF.CMD := FILES<FNUM>:\ %q\:FILES<FNUM>
                  EXECUTE SF.CMD CAPTURING NULL
                  OPEN "%q":FILES<FNUM> TO QFILE THEN
                     SELECT QFILE
                     EOF =0
                     LOOP
                        READNEXT FID THEN
                           CRT "  deleting ":FILES<FNUM>:" ":FID
                           DELETE QFILE,FID
                        END ELSE
                           EOF=1
                        END
                     UNTIL EOF DO REPEAT
                  END
               END
            NEXT FNUM
         END
      REPEAT
          

      (Edit Oct 16, 2006, TG, formatted code for readability, added disclaimer "not tested by Tony". Thanks to Phil for the contribution. Comments on this revision are welcome.)

    • The problem Phil is fighting is that the machine itself is shutting down unexpectedly and leaving the entire DBMS in a corrupted state. Cleaning up the DesignBais files isn’t going to fix that problem; a full restore may be required. The program for doing mv.NET housekeeping was a response to a specific problem that is experienced in many systems. Clearing DesignBais files along with that seems to me like overkill, and I don’t see this as a general problem that needs to be solved in other sites. Because of the formatting issues seen above I may reformat Phil’s response in a few days. (The code should be revised for many reasons anyway – there’s no need for q-pointers, using NULL as a variable isn’t a good idea, and a simple clear-file would suffice rather than looping on a Delete.) Because this is a response to a unique problem I may just remove these comments, though I thank Phil for his contribution and welcome others.

      I suggest DesignBais files should be purged only as necessary, which is less frequently than this mv.NET routine should be run. See my blog article on Housekeeping. I didn’t mention the DBIAUDIT log in that article, though I should have. I’ll add a comment about this file soon to that article.

    • Yes, a full restore was required, however the file clearing was suggested by Rick Weiser, and I quote "sometimes mv.net gets away from itself and corrupts some temp files". I didn’t realize I was the only one this applied to!
      And just in case I appear a complete Pick numpty:

      Yes, a clear-file (sic – clearfile surely?) would obviously suffice, but I wanted to list what was there at the same time;
      In Reality (my MV preference) CAPTURING NULL is the specific (and recommended) syntax if you don’t care about the output; I didn’t realize D3 didn’t have that option.
      I am intrigued as to why Q-pointers aren’t necessary though, unless this is some quirk of D3 that I am not aware of and isn’t covered by the manuals.

    • You’ll notice in my article on housekeeping that I also suggested clearing DB files periodically, but this is completely unrelated to the mv.NET issues. It’s unknown at this time whether the mv.NET temp file corruption is due to some flaw in mv.NET itself or if it’s the result of some improper coding in the communications interface in DesignBais to mv.NET. (In my opinion no improper coding should lead to corrupted work files, so there may be bugs on both sides.) Part of the problem could be that DesignBais does not use mv.NET session management and (to my knowledge) does not have any process that invokes mv.NET housekeeping. If DBI assumes responsibility for session management then it’s my personal belief that they should also handle the housekeeping that needs to be done. I hope this will be built-in as a v4.x enhancement. BlueFinity is aware of the issues and I believe they are working with DBI to resolve them. I suspect the best approach to resolution will be for DBI to upgrade to mv.NET v3 in DesignBais v4.2. I will be happy to work with both companies to test fixes from both sides.

      Your issues with file corruption outside of the mv.NET temp files are unique to your system; DesignBais files are not getting corrupted at other sites. This issue with the temp files is nothing more than an inconvenience, which is why I approached this problem and (temporary) solution as I did. The major corruption that occurred at your site is the result of something system-specific and other sites do not need to worry about such things.

      I don’t want to get into a discussion about coding technique in this area, but I will briefly explain my notes related to your code:
      – I said "clear-file" rather than "clearfile" because that would take a single Execute statement, whereas Clearfile would require opening the file first. It’s one line vs two lines, and now we’ve both written much more than that so the attempt at brevity was in vain. To belabor the point, using Open and Clearfile would consume fewer runtime resources than pushing a level to execute a TCL statement, so this all boils down to one’s personal preference of less code vs faster execution time, and that leads to the question of whether using faster hardware justifies convenient coding techniques. This could be a whole blog entry in itself, let’s not delve into this topic here…
      – I haven’t checked other platforms but in D3 the word NULL is a keyword. Your code may get ported to another DBMS in the future where DesignBais uses mv.NET, and your code may break because the new system might not like the use of a reserved word where a variable is required.
      – Q-pointers aren’t required because D3 makes use of file paths. The code to open the MD, write a q-pointer, then open the pointer, can be replaced with the single statement OPEN ACCT:",":FILES<FNUM>:",". See docs for OSFI and File Paths for more info.
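
      For illustration only, here is a minimal sketch of how the DesignBais file-clearing step might look with the notes above applied: a D3 file path in place of the q-pointer, and OPEN plus CLEARFILE in place of the delete loop. It has not been tested. It assumes ACCT and FILES<FNUM> are set as in the routine above and that the current user is allowed to open and clear files in the other account. FPATH and DBFILE are hypothetical names introduced for this sketch.

      * Sketch only, not tested. D3-specific: uses a file path of the
      * form account,file, as described in the note on q-pointers.
      * ACCT and FILES<FNUM> are assumed to be set by the calling loop.
      FPATH = ACCT:",":FILES<FNUM>:","
      OPEN FPATH TO DBFILE THEN
         CRT "clearing ":FPATH
      * Remove every item from the data section in one statement
         CLEARFILE DBFILE
      END ELSE
         CRT "cannot open ":FPATH
      END

      Unlike the loop in Phil’s version, this does not list each item as it is deleted, so keep the loop if that listing matters to you.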

    • I agree, coding techniques could be a whole new blog … cross platform issues especially!
      Using NULL in that way is just one of many Reality-isms I will have to learn to forgo in the quest for cross platform compatibility, much like my favoured LOOP WHILE READNEXT ID DO … REPEAT construct.
      The use of file paths in OPEN statements is something which also would need to be viewed with caution in that respect however – that syntax wouldn’t work on Reality.
