tech blog

introductory notes:

1. For time zone purposes, I am near Atlanta and use my local timezone. Atlanta is the same as New York (America/New_York).

2. Starting sometime in December, 2021, I am often addressing one of my apprentices; he is the "you."

3. Drafts / previous versions of this page are on GitHub. Usually the live version is way ahead of GitHub.

4. Often I'm posting to GitHub and not commenting here. If you click on my repos, they are sorted by the latest changes.

April 27, 2022 (00:07)

I have started an Ethereum / NFT project.

April 11, 2022 (01:14)

It's been an interesting few weeks. There is a lot to report, but I'm not sure how much I'll get out now. My immediate purpose was to record, for myself and whoever finds this, the details on the MongoDB 5.x CPU requirements. For Intel, it needs at least the "Sandy Bridge" microarchitecture. WikiP tells us this came out in 2011 with the Core i3.

It would appear a lot of people didn't get to the bottom of this. Fortunately, I figured it out fairly quickly and reverted to 4.x. Here are some of the error messages I see, in case Big Evil Goo picks this up and it helps someone. I probably figured it out because I'm well aware I have ancient hardware. It's about 2009 vintage. It was a $3 - $5k computer at the time, I'd imagine. I got it for something like $160 in 2017.

(core=dumped, signal=ILL)
status=4/ILL
core-dump
core-dumped
Illegal instruction (core dumped)
kernel: traps: mongod trap invalid opcode
mongod.service: Control process exited, code=dumped status=4
mongod.service: Failed with result 'core-dump'.
dumped core
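
As I understand it, the practical requirement behind "Sandy Bridge" is the AVX instruction set, which MongoDB 5.x assumes on x86-64. Here is a minimal PHP sketch (my own, Linux-only, reading /proc/cpuinfo) to check a box before trying 5.x; the output wording is mine, not MongoDB's.

<?php
// Minimal sketch: does this CPU advertise AVX (roughly Sandy Bridge and newer)?
// Assumes Linux, where /proc/cpuinfo lists the flags per core.
$info = file_get_contents('/proc/cpuinfo');
if ($info === false) {
    exit("could not read /proc/cpuinfo\n");
}
if (preg_match('/^flags\s*:.*\bavx\b/m', $info)) {
    echo "AVX present -- MongoDB 5.x should not die with SIGILL here\n";
} else {
    echo "no AVX -- stay on MongoDB 4.x on this box\n";
}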

March 18, 2022

web server access log analysis - README redone

In answer to my "usual" (recent CS grad) apprentice, I rewrote the README of my web logs repo.

create a git branch

the right way

I found the note file where I saved the commands, but it took me way too long to figure out what I had branched. The command that found it: grep -R branch | grep -P "0\.5". It was my CMS. Note that the comments from further below on "master" versus "main" and tag versus branch apply.

git branch 0.5
git checkout 0.5
git add -A .
git commit -m "trying to create branch 0.5"
git push --set-upstream origin 0.5
git checkout master
the long way / sort of wrong way / correcting mistakes

The following were the actual commands where I went around in a circle. They did the job, but they're not the ideal commands. I'm "saving" this here because I'm removing it from the README (see link below). Note that there is ambiguity between the "main" branch and the "master" branch. The libtard speech Stasi deemed "master" inappropriate. Git seems to be in the middle of transitioning both its gender and its use of those words. I don't remember which is the right word as of that version or the current version. Here is the link to the branch I created. In hindsight, a tag rather than a branch may have been more apt.

Note on branching:

The key is to create the branch and THEN to check out (switch to) the branch. I think you have to commit it. Make sure it shows up in origin / on GitHub.

git branch 0.32
git checkout 0.32
git add -A .
git commit -m "trying again to create branch"
git push --set-upstream origin 0.32
git checkout main
git add -A .
git commit -m "removing all from mai[n] temporarily"
git checkout 2a7231bda956def5e205e910062b7f3f4b23c046 cli/t1.php
git checkout 2a7231bda956def5e205e910062b7f3f4b23c046 README.md
git checkout 2a7231bda956def5e205e910062b7f3f4b23c046 parse.php
git add -A .
git commit -m "new main or master branch"
git push

March 17, 2022 - web log introduction for "sales guy" apprentice (started ca 22:40, updated: 3/18 17:03)

Note that on 3/18 00:15 and possibly later, I am adding stuff at various points. It's not always the case that the furthest down is the newest. At 17:03 I added one last note just below, and now I will almost certainly close the entry.

One more note, out of order. The line below shows a fetch / "GET" of my apprentice page. I should not assume that's obvious.

Below is a line from my web server access log. I'm going to separate it into 2 lines for page aesthetic purposes.

66.249.70.62 - - [17/Mar/2022:16:33:55 -0400] 689777 "GET /t/9/02/apprentice_steps.html HTTP/1.1" 200 3646 "-" 
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 

This is what I've been nattering on about and have built a vast edifice to process.

The lines represent every page and other object query to a site. That is, the fetch of the .html page, all of the JavaScript files individually, all images, all external css, etc. That particular line is from the Big Evil Goo Bot. That's how Goo builds its search data: its robot queries pages, runs them through the wrongthink filter, and then maybe indexes the page for the search engine, if the page doesn't threaten various of their sacred cows.

The first field--66.249.70.62--is the IP address of the robot--where the robot is "calling from." I picked a line from the Goo Bot so that I could reveal the IP address. I feel much less than zero moral obligation to protect the privacy of their IP addresses. If you run "whois 66.249.70.62" without quotes from the Linux command line, after perhaps installing whois, you'll see that Google owns that IP address. There are also web services for whois.

The next 2 fields "- -" are the remote logname (from identd, which almost nobody runs) and the user name if someone is logged in with Apache's ancient password method that very few use anymore. (Apache doc.) As I think about it, I should probably remove those from future logs, but not now. Anyhow... They are always "- -" in my case, which is why I should consider removing them.

Next is the time with the GMT / UTC offset, set to my local time. The next field (689777 in the example) is the microseconds, which I added; my EXACT log format is displayed on GitHub, and the "%u" in it is where the microseconds go. This goes to my obsession with unique index methods. I was trying to uniquely identify lines, but it didn't work: there are HTTP "408" response code lines that can happen in the same microsecond, or at least are logged as the same microsecond.

the "ancient" Apache passwords

Sales guy apprentice (SGA) asked about the ancient Apache login method. I call it ancient because I read about it and used it briefly years and years ago, and I would guess it's much older than when I used it. For reference, here is the password feature in Apache's doc.

SGA asked for clarification. Their password feature allowed you to create a password in a file on the same server as the web server. When you say "their side," Apache as an entity was not involved. That's not one of the "sides." It's "their side" in the sense that the password file was on the same side (server-side) as the web server.

I think you can integrate their password into a database, but I'd imagine that was / is clumsy. That's one reason it wasn't / isn't used. Another is that I have no idea to what degree they kept up with password hashing. The PHP language has built in more and more sophisticated hashing, for example. I don't remember what the permission issues were around the password file, which is another drawback. I could probably think of more. I'll just say that putting user data in a database is just how it's been done for decades, and this is one instance where I have no objection to "how it's done."

back to log files

Back to the log file, I have debated removing microseconds because they don't do what I want, but, then again, there is very likely information to derive from the microseconds, so I'll keep them.

The next field is "GET /t/9/02/apprentice_steps.html HTTP/1.1". GET is one of a handful of "HTTP request methods" or actions or verbs. I have not crunched the data on it, but I believe it's safe to say that GET is by far the most common. GET is what the browser (or bot) uses to get a page in most cases. Then "HTTP/1.1" is the HTTP version used. I suspect that all my requests right now are HTTP/1.1. HTTP/2 exists but is early in its support, last I checked. HTTP/2 goes to a binary rather than human-readable text format, which is surprising. I am not excited about a binary format, so I am in no hurry to adopt. As of now I see no technical pressure to adopt.

200 is the HTTP response code, such as listed / linked above. 200 is the famous "200 OK". It is probably / hopefully the most common response, although it would be interesting to crunch numbers on that. It probably is the most common, but there are quite a few hack attempts that result in "404 not found." The 404s might be 5 - 10%, as I think about it. I call it famous because I know someone with a bumper sticker "200 OK," so it must be famous, right? To elaborate a bit, the 200 in this case means that the page exists and is accessible / has proper permissions, and it was served up successfully. (Successful in that Apache sent it. It doesn't mean it was received, although it likely was.)

3646 is the number of bytes returned. When I come back, I'll compare that to the HTML on the disk. "$ ls -l /[document_root]/t/9/02/apprentice_steps.html" shows 7984. If you do control-shift-I (capital I / India) in a browser (Firefox and others), go to the "Network" tab, and refresh a given page, you'll often note in the "Response Headers" an entry "Content-Encoding: gzip". Gzip is a compression algorithm like the old WinZip / .zip. So the 7984 compresses down to 3646. I don't remember if that number includes the size of the headers.
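
To illustrate the compression point, here is a throwaway PHP sketch comparing the on-disk size to the gzipped size. The path is the [document_root] placeholder from above, and Apache's mod_deflate won't produce byte-for-byte the same number as gzencode(), so treat it as a ballpark.

<?php
// Ballpark check of the 7984 -> 3646 compression above.
// The path is a placeholder; substitute your real document root.
$path = '/[document_root]/t/9/02/apprentice_steps.html';
$html = file_get_contents($path);
if ($html === false) {
    exit("cannot read $path\n");
}
echo 'on disk: ' . strlen($html) . " bytes\n";
echo 'gzipped: ' . strlen(gzencode($html)) . " bytes\n";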

"-" is the referrer, or the site that referred the browser to that page. For example, sometimes I see "https://www.google.com/" which means someone found the page from Goo Search. Sometimes I'll see referrals from DARPA LifeLog (Big Evil Facebook) or Goo's "Tube." There are also internal referrals such as JavaScript pages being "referred" from the HTML page.

"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" is the now-infamous-to-me "User Agent." "User Agent" is an odd term. It means the program that sent the request. In this case, it's the Goo Bot. I list lots and lots of user agents elsewhere.

return to Apache's passwords

SGA is curious, so, here we go. "$ htpasswd -c /var/www_ex/apache_pwd_ex sga" creates the password, with user sga. I type the password as I create the file entry. The super-secret password is "password123" without quotes. The "protected" directory / file is here. I'm following the instructions as linked above, and I'll link again. I created www_ex for this purpose. You want to create a file outside of the Apache file tree (DOCROOT / document root) because you don't want Apache to serve the password file. Apache (www-data user) needs permission to read the file, though.

March 5, 2022, starting 13:04, winding down (maybe) at 16:26, with many long breaks in between, maybe posting 19:06

"PHP Fatal error: Uncaught Error: Call to undefined function mysql_connect() in /...path/[implied index.php]" You are using PHP 8.x, right? Or at least 7.x? Have they already aliased the 1990s "mysql_connect()" to be mysqli_connnect()? (MySQL goes back to '95.) The PHP documentation does not indicate that. In any event, I would use mysqli_connect(). You'll give old-timers like me hives just looking at the thought of the old mysql_connect(). Note that the "old" mysql_connect() is gone, per the doc.

I ranted about this several weeks ago. CERTAIN CLIENTS who sent me screaming were still using the original mysql... functions in PHP 5.x. The original mysql functions were deprecated in roughly 2013 and removed from the language in 2015. (The years are from memory.)

apt-cache policy php-mysql
php-mysql:
  Installed: 2:8.0+82~0build1

This leads me to believe that mysql_connect is an alias, but, still, for my peace of mind, if nothing else, consider using mysqli. If you want to do some research and see where it's documented as an alias, I'd probably live with it, but, given that the documentation still shows deprecation, I would avoid the mysql functions.

With all that said, I just got a reply that he's using mysqli now. Good.
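
For reference, here is a minimal mysqli sketch. The host, user, password, and database name below are placeholders, and mysqli_report() is there so failures throw loudly instead of limping along.

<?php
// Minimal mysqli sketch; credentials and database name are placeholders.
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT); // throw exceptions on failure
$db = mysqli_connect('localhost', 'dev_user', 'dev_password', 'testdb');
$result = mysqli_query($db, 'SELECT NOW() AS now');
$row = mysqli_fetch_assoc($result);
echo $row['now'] . "\n";
mysqli_close($db);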

Now, back to the error itself. I'm not sure when (or if) it became relatively clear to me that it meant you had to install something separately. I've had variants of this drive me nuts fairly recently, though. I'll come back to that.

For future reference, consider the difference between the mysqli doc and substr() (or string functions generally) doc. The mysqli doc natters on at some length. That's a hint that something needs installing. I would probably not recommend reading all that nattering. Given that you're reading the general PHP doc, it may not be helpful towards the simple question of "How do I install this in Ubuntu?" Also, there are all sorts of references to "--with-mysqli" and related switches that I have never needed to worry about because I'm not compiling PHP from scratch. When you install php-mysql, it does the same thing as the switch.

It is harder than it should be to know which package to install in Ubuntu. In this case it's php-mysql, as the apt-cache output above shows. Big Evil Goo usually answers it easily enough, but it's the sort of thing that may be quicker when I'm around to show you what I have installed.

By contrast, the string functions say, "There is no installation needed to use these functions; they are part of the PHP core."

One variant that drove me nuts was when one of the MYSQL(I) constants was not defined. I didn't know if that was a PHP package thing or a Stoopal thing. It turned out that I hadn't installed mysqli. That was not obvious to me.

A worse variant is when the rewrite engine isn't enabled. The manglement (sic) systems should go out of their way to check that it's working. I wrote code to do it fairly quickly, so they can. Otherwise, you get the damnedest errors. I might have spent 3 hours fussing with one of them until I finally realized from relatively subtle signs, using the debugger, that the lack of rewrite engine was the cause. These issues are some of the many issues I have with manglement.

On an apprentice procedural note, my theoretical standard for something like the error above is to spend 20 minutes thumping on it. Then ask me. It's not worth that sort of frustration. I'm not sure you learn anything from it. I've said it before, but it's worth repeating: I remember spending 3 hours in 1992 trying to figure out SYNTAX errors. It would have been lovely to have someone point them out. I don't think I learned much from all that. I just got very frustrated and probably would have gotten out of dev if I weren't so damn stubborn. I think I've told the story here of someone who did quit dev in 1985 because his compiler simply said "syntax error" with no line number or hints or specific syntax violations.

In 1992, I sort of kind of learned to break code into smaller and smaller pieces--isolate the problem more and more tightly--to figure out where the problem is. It would have been nice to have someone state that in the context of my problem as a general rule. I sort of learned it, but I'm not sure it really stuck for a while. I think I've complained before of people who post to StackOverflow with 100 lines of code. Their problem is within 3 lines, and they should have isolated that before asking the question. That applies to StackOverflow, though. In the case of asking me, I'll go with the 20 minute rule for now. And maybe that should be 10 - 15 minutes.

To recent CS grad apprentice, do you want me to link to your blog? I'll do it if you want, but... I suppose I can ask this openly because it goes for any apprentices. You may have noticed that I'm getting more and more, shall we say, testy, on this website. I'm getting closer to the point where I say publicly what I might have only said privately in the past. I stand the risk of eventually being deplatformed or condemned by the ADL--those being about the same thing. I am at risk of being put in the same category as Putin and Foreign Minister Lavrov are right now. So, do you want to be literally linked to (or from) me? I would not be at all offended if you said "no."

In other news, CS-g apprentice wore a hoodie to a recent job interview, and even "swore a little bit." He is doing an experiment in being one's self. Maybe there is hope for me, but maybe not. He's not a fan of Putin and Lavrov. I'm not trying to be insulting, but I doubt he has consciously heard of Lavrov. Indeed, Lavrov has likely been somewhat obscure until the last few weeks. I know of him because friends who keep much better track than I do have been a fan of Lavrov's for many years. I didn't know until just now that Lavrov has been FM since 2004, so I've heard about him for quite a while.

Russian military fandom aside, I suppose my equivalent would be barefoot for an interview. Decades ago I heard a reliable report of an office in Orlando being commonly barefoot. Actually, I would be wearing my 2009 Holy Name Cadets Drum and Bugle Corps hoodie and barefoot during a light snow.

Anyhow, where were we?

You mentioned PHP and (client-side) JavaScript and loosely implied that learning them was in conflict, or that server-side (PHP) code had much higher priority. Many weeks ago I talked about the potential for 4 layers of code filtering from the database to the front-end. PHP and client-side JavaScript are part of the same team to process web data. Sometimes it makes sense to do most of the processing on the server-side and sometimes on the client side, and sometimes it isn't clear which is better.

Things are processed on server side (whatever the code / language is), and then the result is then sent over to the client. The client doesn't see any code, just the output. You'd think i would already have known that but man it didn't click until a couple days ago.

Now, i'm guessing JS is processed on client side?

Correct on both points. More specifically, client-side JavaScript, which is what you're talking about in context, is processed in the browser. That's why you see it debugged in the browser's dev tools, and you can't see the JS execute on the server-side.

Which leads me to Node.js. Node.js is server side JavaScript. Just in the last few weeks it's finally started to fully sink in WHY the MEAN stack goes together. It's somewhat of an equivalent to your "revelation" above, but I don't have an excuse for not having my revelation much sooner. An apprentice guideline, to repeat, is that Kwynn is not all-knowing, and I suppose I can sometimes be kinda dumb.

Actually, let's leave Angular and Express aside. Angular is a client-side JavaScript library, so the JS part matters. MongoDB's equivalent of SQL is JSON (BSON) formatted. The distinction between JSON and BSON isn't at issue for this limited discussion. I'll just call it JSON for now. In any event, queries including data entry are JSON formatted. A query result is essentially a JS object with some portion of the JS language to operate upon it. I don't yet know the extent of JS in Mongo objects.
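
For a concrete taste of the JSON-ish queries, here is a minimal PHP sketch. It assumes the mongodb extension plus the mongodb/mongodb Composer library; the database and collection names are placeholders.

<?php
// Minimal MongoDB-from-PHP sketch; "test" / "hits" are placeholder names.
require 'vendor/autoload.php';
$hits = (new MongoDB\Client('mongodb://localhost:27017'))->test->hits;
$hits->insertOne(['ip' => '66.249.70.62', 'status' => 200]); // the "JSON" is just a PHP array
$cursor = $hits->find(['status' => 200], ['limit' => 5]);    // the filter is also array / JSON-ish
foreach ($cursor as $doc) {
    echo $doc['ip'], "\n";
}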

One point being that I've been doing all sorts of machinations to harmonize PHP and Mongo. My life might be easier if I just learned Node. To this end, I installed a Node debugger early this morning. Let us all repeat in chorus, Kwynn's Rule of Dev #1: never dev without a debugger.

I spent 2 - 3 hours researching various debuggers, re-installed Eclipse, and tried getting Node to work in Eclipse. So there's Rule #1, and then there is my dedication to open source tech. So, when those two come up against my disgust and perhaps hatred of SatanSoft, it seems that I go with open source and Rule #1. After resisting it, I installed the MIT / open source licensed Visual Studio Code for Node. It works. Perhaps I should set some apprentice or another on finding a non-SatanSoft product.

"You keep saying don't code (such as php) without a debugger. How important is this over say doing 'php index.php' and having it give you the errors at the terminal?" They are not equivalent. Running PHP as CLI is different than using a debugger. There is no substitute for a debugger. Just to be clear on terms, by a debugger I mean software that allows you to set breakpoints, step through code, and watch variables as you step through.

I should explain why they are different. First, I'll back up and discuss CLI PHP versus "web" PHP. Remember that those two types of PHP have separate php.ini files. Also remember that if you change /etc/php/8.0/apache2/php.ini, you have to restart Apache for the change to take effect. Anyhow, for dev purposes you should turn on "display_errors" (that's the .ini variable for showing errors) in web PHP. That is, you should see the same error in both CLI and web.

By default, display errors is off in the web version because in theory you don't want the public to see the errors, because it could give an attacker insight into how to attack your site. Depending on what you're doing, I vote for turning display on even for a public site. I am almost certain display errors is on for kwynn.com.

With that said, one of the reasons why it's helpful to run CLI on a web application is that the error may not show up in the HTML output, even if display errors is on. Sometimes you'll get a totally blank HTML page. kwutils.php changes the way errors are handled, so I'm somewhat confused on this point because I'm usually using kwutils these days. But, as I remember, you get blank when the error is upstream of the page that was GET / POSTed from the browser. I think this is because by default errors go to stderr and not stdout. The HTML display is from stdout. That is, the PHP program "crashes" before there is any stdout.

Similarly, you won't see an error sometimes (even if display errors is on) unless you view source, because the error message may not be proper HTML. There is also a setting where errors are specifically HTML formatted, but, again, I've kinda lost track of this because I handle errors differently.

Anyhow, using CLI rather than web mode to "debug" is a minor point versus using a debugger for CLI, web, or ANYTHING ELSE. But, let me finish on the CLI versus web part, first. The reasons that it's often helpful to use CLI to "debug" are something like:

As above, display errors might be off (although you should turn it on), the error might not otherwise show in the HTML (stderr vs. stdout), or it might be embedded within HTML in a way that doesn't display. There are other reasons. Even if you're debugging, you're going back and forth to the browser, and just the browser being involved at all makes things slightly more complicated. I'll try to think about this. My guideline on this weeks ago was that it's often best to dev in CLI mode until you're ready to bring it into HTML. And it's often best to separate all the processing until you combine the HTML and PHP.

On a related point, don't forget /var/log/apache2/error.log or whatever you may have overridden error.log to. Sometimes very helpful PHP errors and warnings will show up in error.log. And, on a related point, when you turn display errors on, also set error_reporting to E_ALL so that warnings and notices display, too. Occasionally you'll have to hide notices because they pop up in places that are not useful, but I've only had to do that 3 - 4 times in many years.
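
Here is the dev-time combination in one place. In a real setup these belong in the php.ini files (both the CLI and apache2 copies), but ini_set() works for a quick sanity check; the setting names are the real ones.

<?php
// Dev-time error visibility, per the discussion above.
ini_set('display_errors', '1');          // show errors in the output
ini_set('display_startup_errors', '1');
error_reporting(E_ALL);                  // include warnings and notices
echo $no_such_variable;                  // now produces a visible warning rather than silence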

So back to debugging with a debugger. In some cases seeing an error message is all you need. The debugger comes in for situations where it's not working but there are no messages. In other words, a debugger helps solve your logic errors more so than PHP language errors. I think you once asked about echo / print / console.log. Your code will get cluttered with more and more of such if you go that route. Then there is the problem of your code never getting to the output points. I had that happen the hard way when I was trying to debug Ruby without a debugger. I was getting a distorted picture of what was happening because my biggest problems bypassed my "print" (or whatever it is in Ruby) entirely. In a debugger, you're running the code line by line and know EXACTLY what is happening and where in the code.

In other news, I hope I can declare my XOR processing as good as it's going to get in any reasonable amount of dev time. One tentative conclusion is that mongo CLI / mongosh is not particularly efficient at output. Although it's not apples to apples because I was running that output as a "$ mongo" shell command, and running any command through shell_exec() or the equivalent is relatively slow. In any event, I went back to queries being done from PHP and outputted downstream from proc_open(). With 12 hyperthreads / "cores" (6 real cores X 2 for hyperthreading), my XOR processing takes about the same time as one core does to XOR the raw file. Although that is also not apples to apples because I have an old computer versus Amazon's spiffy new ones.

I also did some buffering, both to limit the number of cursor queries and the number of fwrite()s. It appears that my entire CPU capacity pegs for somewhere around one second. Perhaps that's the best one can get, as opposed to the CPUs waiting on RAM or disk. There are some indexes that help a lot. Of course those same indexes slow down inserts. I need to whittle down my indexes and try to find that balance.

The Mongo .explain() is helpful, although I'm not used to how it works versus relational. It was making some odd index choices that I had to curb. I simply deleted the index it was using and added another one that improved the situation dramatically.

Back to the email series. As I said, messing around with data types in advance in a database table is one reason I switched to MongoDB. In Mongo, you simply chuck an array or object into the database.

The number in VARCHAR(30) is the maximum number of characters--in this case, 30--the field can hold. The number is up to you. How many bytes are in each character is a separate issue. In Mongo, you don't have to worry about this stuff. What is UNICODE up to now? UTF-8 goes up to 4 bytes per character, last I checked. Once upon a time, a character was a byte. The modern issue with VARCHAR and related data types is that one byte doesn't remotely cover all the characters in all the languages on earth. You need a set for Cyrillic, Mandarin, Hindi, Arabic, etc. You can restrict your database to the "original" characters, though.

This issue might have cost me a lot of money. I was once offered a 30 hour project, with possible extension. I was loaned a laptop with a working MySQL or MariaDB. It was enough years ago that it might have been either one of them. My KVM connectors and such were buried at the time. (I'm not sure if I could get at them now or not.) A laptop keyboard is almost useless to me. I couldn't get the database to load on my desktop. Given the nature of the restore program, the specific error was not easy to figure out. I finally found out that it had to do with UNICODE. The other dev had left an email address field at 255 characters. It was something that did not have to be anything close to 255 chars. He had an index on that field. His database was set to Latin-whatever-number-it-is or some such. By default, my database was set to multi-byte UNICODE, and I think it was 4 bytes at the time. The problem was that MySQL or MariaDB didn't allow an index on a field above something like 768 bytes. Not chars, but bytes. At 4 bytes per character, 255 characters is 1,020 bytes, over the limit. So it wasn't allowing the table creation with the unique index. Now that I think about it, I didn't see the error because the import did not terminate. It just kept going after failing to load the test email addresses. It took me a while to set things up to see the error properly.

I had already been questioning whether I wanted to use relational again, and I wasn't sure I wanted to deal with Laravel, either. By the time I got the db loaded, I decided to walk away from the project. If he had given me access to the data file and not loaned me a laptop, I probably would have solved it. I kept going back and forth from desktop to laptop, though, and everything about the laptop was painfully slow. One problem was I didn't have a good place to put the laptop. It was a weird combination of events. It's one of those "What ifs?"

March 3, 2022 - commutative hashes / XOR by line (begin 18:45)

To quote Colonel Hannibal, "I love it when a plan comes together." (Although Captain Reynolds naked in the desert--"Yeah. That went well."--that might be yet better.) I suppose, while I'm at it, I should mourn the loss of General Hannibal of Carthage. The world would likely be a better place if Carthage had won, but I'm not sure they stood a chance under the circumstances. My vague memory is the world might have been a better place if Carthage had committed to total war. Hmmm.. There is discussion on the "What If?" and it is likely infinitely more honest than equivalent situations in the last 100 years. Might make for interesting reading. But I digress.

So I have been musing over this issue of quickly validating my web server access log database versus the source file(s). One problem is that all the hashes I can quickly find are linear in the sense of a VHS tape--they are only useful with one long string. They can't be parallelized. So I got the notion of XORing each line and then XORing the lines together. That is parallelizable. (That may not be a word, historically speaking. It is now.) I started in PHP. Right this moment, I don't remember why I went to C. At first I was concerned about 64 bit unsigned versus signed. That may have been it. PHP doesn't have an unsigned. Then I, dare I say, "circled back" and used signed in C. (I use "circle back" in mockery.)

In any event, I now have XOR working as planned in both C and PHP. (The code was in GitHub before I started writing this.) C and PHP match both forwards ("$ cat") and backwards ("$ tac"). This heavily implies they will match with the lines in any order--thus, parallelizable. C is 100+ times faster than PHP. I'm not sure I've ever known that the situation was quite that "bad." Our hardware these days is just stunning enough that I could go this long without knowing it.
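
Here is a minimal PHP sketch of the per-line XOR idea, assuming 64-bit PHP. It is not my production code (that is on GitHub); it just demonstrates the commutativity: forward and reverse order give the same result.

<?php
// Fold each line down to one 64-bit value, then XOR the per-line values together.
// XOR is commutative, so the order of the lines does not matter.
function xor_line(string $line): int {
    $acc = 0;
    foreach (str_split($line, 8) as $chunk) {
        $chunk = str_pad($chunk, 8, "\0");   // pad the last chunk to 8 bytes
        $acc ^= unpack('P', $chunk)[1];      // 'P' = 64-bit little-endian
    }
    return $acc;
}

$lines = file('access.log', FILE_IGNORE_NEW_LINES); // placeholder file name
$forward = $backward = 0;
foreach ($lines as $l) { $forward ^= xor_line($l); }
foreach (array_reverse($lines) as $l) { $backward ^= xor_line($l); }
var_dump($forward === $backward); // bool(true)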

March 2, 2022 - .php versus .html (start 23:03)

My apprentice is about to bring MariaDB (MySQL fork) data into a web page. That is, he's going to try. I'm sure he'll get it fairly soon, but that's the sort of thing that doesn't go smoothly the first time you ever, ever do it. With that said, he may have accomplished that particular "Hello, world!" already.

His page was index.html. He couldn't get any PHP to work inside a .html file, which is not surprising. By default, Apache won't execute PHP inside a .html file. In almost every case I've seen, PHP runs from a .php file. There are a small number of other extensions that will execute PHP; I *think* .phtml (the old PHP template extension) works, but I'm pushing my knowledge / memory. Remember that included (required) files already run in a PHP context, so you could call an include file anything, although I can think of very few reasons not to stick with .php. That is, an include file is only meaningful if it's coming from a PHP context, so there must be a .php file somewhere down that call stack. Similarly, if PHP is file_get[ting]_contents() a file, it can be called anything.

So the short answer is change it to .php. I have never done anything different. You asked about speed. I'll address that further below. There are some interesting issues around that, but they are all very, very minor relative to what you care about right now.

He asked whether it is possible to run PHP from .html. I'm fairly sure you can override something in Apache (map .html to the PHP handler with AddHandler or a FilesMatch block, I believe), but I don't think I've ever seen it done.

With that said, some sites suppress .php. I just tested and found that one of the systems that shall not be named will let you do example.com/index.php. I find that the most "popular" system that shall not be named will not. ("WordPest" is not strong enough.) It will redirect (301) /index.php to /. I think the redirect is done within the bowels of the application and not in the Apache .htaccess, but I'm not entirely sure because I'm not fluent in rewrite rules. To the extent I understand them, I put it at a 70% chance that it's done in the "bowels."

I'll call the "popular" one WordPlague. Word pest, word pestilence, Word Plague, or simply The Plague. That works. I'll call the other one Stoopal because it will cause your development skills to stoop until you're dependent on the cane / crutch of it. You'll never be able to walk properly or run after too much use.

In any event, as best I understand without digging too much, often you don't see ".php" because all of the URLs are written with the assumption of a "single page" application and / or other redirects / rewrites. In both the cases of The Plague and Stoopal, most HTTP requests are rewritten and sent through index.php. index.php in turn starts a series of includes and conditional includes that processes the request. This is a case where looking at the superglobals in a debugger would explain a lot, but I'm not sure I can bring myself to care much about the content manglement (sic) systems.

You asked about a speed difference. That is not worth a second of thought or hesitation. I will delay renaming .html to .php until I'm sure using PHP directly in the file is the right answer. That's a matter of clarity, though. It doesn't make sense to use .php unless you're specifically using PHP.

Months ago, I might have leaned towards .php to leave myself the flexibility, but I'm getting better at rewrite rules, so I'm not as worried about that. In fact, I did a rewrite from .html to .php recently. It's very profound, isn't it? Note that using .htaccess files in various ways requires a specific "AllowOverride" in the virtual host's .conf file. All that stuff is in my sysadmin repo. If the machine in question is yours, I don't see a problem with very loose AllowOverride. The potential problems come with shared hosting where you are the host.

To do a fair test of a speed difference, you would want to run precisely the same content with .php and .html. You can set your /var/log/apache2/access.log to show microseconds. I've done it, as should not be surprising. The example is in my sysadmin repo. If you call a page twice using the same method, it would give you an idea. Or call it 1,000 times.
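
If you really want to measure it, here is a crude sketch; the URL is a placeholder, and it measures the whole round trip on localhost, so it is a ballpark rather than an isolated .php-versus-.html number.

<?php
// Crude average-response-time check; the URL is a placeholder.
$url = 'http://localhost/test.php';
$n = 1000;
$t0 = microtime(true);
for ($i = 0; $i < $n; $i++) {
    file_get_contents($url);
}
printf("%.3f ms average over %d requests\n", (microtime(true) - $t0) * 1000 / $n, $n);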

I should add that I've given some thought to turning microseconds back off. I was hoping they would provide a unique line identifier, but they do not. A 408 HTTP response (request timeout) will show up twice in the same µs. As best I understand, a 408 is Apache not-so-politely telling a browser to shove off and disconnect. That's all deep inside Apache. I'd have to dig to start to understand the context. In any event, those two lines are completely identical including the microsecond.

In any event, I'm not sure you would detect a time difference. If you did, it would be way too small for human detection. It's just not a consideration. There may be situations where you want to load some of your data with AJAX after the page has loaded, but that's a mostly separate issue. You're asking about very similar files as .php versus .html. You're also asking about tiny amounts of data pulling from only slightly larger data into a page. I see no problem pulling it directly from a speed perspective.

I have never tried to detect those sorts of speed differences and have not worried about that. There is an interesting consideration, though, between .html and .php that has been in the back of my mind for a very long time. When Apache serves an .html file, the Last-Modified in the HTTP reply header is the date of the file modification, and the ETag is based on the contents. That makes it easy for Apache to serve up a 304 response. 304 means that the browser sent an "If-Modified-Since" date and / or an "If-None-Match" with the ETag. The browser is reporting what it has in its cache. Apache serves the document if it has changed or else sends back "304 Not Modified."

Contrariwise, if the exact same content were in a .php versus a .html, I don't think the .php has an ETag automatically, and the Last-Modified will be "now." If you want to generate those headers and honor "If-None-Match" and such properly, you have to do it yourself in PHP. I have dug at that a little bit in kwutils.php, but I haven't fleshed it all out.
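
Here is a minimal sketch of doing by hand in PHP what Apache does automatically for a static .html file. It only handles If-None-Match, and the content is a placeholder; treat it as the idea rather than a hardened implementation.

<?php
// Emit ETag / Last-Modified and answer 304 when the browser's cached copy is current.
$body  = 'hello, world';                   // placeholder content
$etag  = '"' . md5($body) . '"';
$mtime = filemtime(__FILE__);

header('ETag: ' . $etag);
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');

if (trim($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
    http_response_code(304);               // "Not Modified" -- no body needed
    exit;
}
echo $body;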

If it's a borderline case between .html and .php, I consider it obnoxious to leave off those features. It's been one of my hesitations in creeping towards a "single page" website. I see the merit in single page. That is not one of my issues with manglement (sic) systems.

more immediate issues

I'll try to wind this down. I'll remind you again that there are whole sections of this blog--perhaps 70%, perhaps more--that I don't expect to make sense right now. I'm recording thoughts for the long term. I'm not sure there is a point in reading it over and over. Perhaps once every 6 - 10 weeks at the rate you're going. Your rate is fine; I just don't want to waste your time. Perhaps we should make an interactive application to help you track where you are with each subject.

As for your comment, "I feel like it's gonna start coming to me really fast, really soon." It has been my guess that there are parts of the learning curve that are exponential or perhaps x^3. But then you are pretty much guaranteed to be banging your head on various new issues, unless you ask for help (hint hint). Yes, I think quite a few things will start to make sense soon.

March 1, 2022 - continuing from yesterday (starting 21:07)

The credit card processing is now a very few steps from live. One big accomplishment is that I have for the first time used Apache virtual hosts exactly how they were meant to be used. I've used the "VirtualHost" tag a zillion times, but I've never set the same machine (and IP address) to serve two domains. I've never had a reason to before. It's sort-of-kind-of as easy as Apache's very first example. I give some "gotchas" below. One criticism I would make of their example is that www. should work via the "*" DNS A record. There should be no mention of www anywhere in DNS or Apache config or anywhere else. That's so 1990s. Oh wait... It's not that simple:

I was about to say that www.kwynn.com works just fine, but given my recent http to https redirect, it does not work just fine. I have the RewriteRule set to preserve the www or anything else. The routing works fine, but given that "www.kwynn.com" does not exactly match "kwynn.com," the security cert process rejects it. I have a solution for that!

/etc/apache2/sites-available/000-default.conf  
# old version: 
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,NE]
# New version: 
RewriteRule ^ https://kwynn.com%{REQUEST_URI} [L,NE]
# end change

sudo apachectl configtest
# Syntax OK
sudo apachectl graceful

Then it works. Ok, with that said, I repeat my statement. "www" is so 1990s. STOP USING IT! STOP REFERENCING IT!

With that said, my client's site still has multiple references to www. I vaguely remember this coming up years ago. At the time, I was too inexperienced and / or chicken to deal with it severely. Right this moment, I don't want to confuse issues. I want the credit card code to be fully implemented. Then one day I will deal with "www" with extreme prejudice.

I started the process of VirtualHost "B" by copying a .conf file. I removed the security references to the exact site, but it turns out that leaving the "Include /etc/letsencrypt/options-ssl-apache.conf" was a bad idea when I was just setting up site B. It passed configtest, but it brought down both sites when I activated it and restarted Apache. So no references to security until you get farther down the path.

So, to get VirtualHosts working for both sites A and B, start with a stripped-down .conf file for the new site B. The following worked. Note that before you run certbot, the new site B has to work with http (see gotcha below). That is, the "certbot" process has to be able to access a non-secure version to verify ownership. By "working" I mean in part that you have your DNS A and / or AAAA records pointed to the right IP address.

ubuntu@ip...:/etc/apache2/sites-available$ more payt1.conf
# the following should be the result once created / edited:

<VirtualHost *:80>

        DocumentRoot  /home/ubuntu/payments_www
        <Directory /home/ubuntu/payments_www>
                DirectoryIndex index.php index.html
                Require all granted
                AllowOverride All
        </Directory>

        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined

ServerName payments.example.com
ServerAdmin buessresume@gmail.com
</VirtualHost>
# end file

sudo a2ensite payt1.conf
sudo apachectl configtest
sudo apachectl graceful
sudo certbot certonly --apache --dry-run

# for real:
sudo certbot --apache

# it's fine to let certbot do the rewrite / redirect from http to https
# that last command does the a2ensite and restarts apache

After that, there is one last annoyance. It's not really a "Gotcha!" but it acts like one, and you could do a lot of wheel spinning if you don't realize what's happening. Now that I think about it, I'm pretty sure the only annoyance comes if you are testing the non-secure http when any https sites are involved. Firefox and presumably other browsers will be assuming a previous 301 permanent redirect from a given IP address to a URL. If you try to go to the non-secure site B and maybe the secure site B, you'll get redirected to the original site A. The solution may or may not be as simple as going directly to the https URL of the new site B. My solution was to use Brave instead of Firefox. I hadn't been to either site in a long while on Brave. After 10 - 20 minutes, the https version of site B worked fine in Firefox, but I'm not sure if that was due to some sort of timeout / expiration or because I went to the secure URL.

Next is another solution for 301 problems. Note that in the Control - Shift - I "Network" tab you can see the 301 redirect. Anyhow, another solution in Firefox: go to history, then hover over the site in question, then right click and "Forget About This Site." You may have to do that for several variants of the site. I don't think that messes with your saved password, but I'm not entirely sure, so keep that in mind.

Feb 28, 2022 - a day or a few hours in the life (starting 17:59; update starting 18:25)

Today / very soon I hope to get back to my main client's credit card processing. That is the main client whose project has always been part-time. I still need more work. I have one source that's looking promising for a bit of work, but it will only be a bit. I've also been getting nibbles for fairly big projects, but we'll see what happens.

For years my client has used the same credit card processor. (I only got involved with that side of the project very recently.) Apparently it has worked fine, so that's much of what matters. They are not a big name, though, and their documentation is not public. Non-public documentation is generally enough reason in my mind not to use a provider. In fact, I was heading towards suggesting this to my client. He created a login for me, though, and the documentation behind the login wall was quite reasonable. Their PHP example was straightforward and worked right out of the box, which is more than I can say for many such situations.

I have whittled on the example and am approaching an end to that phase.  "Whittled" means I confirmed that I can assume a US billing address and whole-dollars only (no cents).  I removed the fax number field (!?!?).  And I'm otherwise cleaning up their code.  It was plenty good enough as an example, but I want it to be cleaner.

The immediate task is to improve their final result handling.  The example simply dumps 1,000 characters of raw XML, which is not exactly comforting or useful to shod muggles.

I have never worked on the site that will use the credit card processor. The site is in the much-despised Drupal. I have more-or-less flatly refused to touch Drupal anymore. What I did was take one of his pages and save it to disk. Then I whittled out all the stuff specific to that page so that I could turn it into the payment page(s). I cleaned up the HTML to some degree, but it was deemed not worth the time to clean it completely. My plan is to put the form on a subdomain. I made my "new" page pixel-perfect such that no one will notice from the page itself that they have moved to a subdomain. Given that it's a subdomain, I can stay away from damned Drupal.

I should also mention that the example, and thus the working version, uses a different technique than my PayPal form. The PayPal form is meant to be very "simple" to implement with front-end JavaScript only. It accomplishes that, but I don't really like that format. Neither does my client. In the case I'm working on now, everything is on the back end, so I have much more and easier control. I am almost certain PayPal can do it that way, but at the time I was trying to do it the quick way. Given that their sandbox was having major problems, though, I probably would have been just as well off to do it in the back end. Even if the sandbox was having problems, the nature of the problem would have been clearer from the back end, too.

update, starting 18:25

For years my client has been using a 3rd party's form to interact with the credit card processor. When he sends people the link to the form, though, it's usually eaten as spam. I suggested linking to it from his website, but he said that it's dated anyhow, and he wants the form(s) redone. When I looked at the existing form, I was having a lot of trouble figuring out where the money went--who are these people? Obviously it's going to him, but it shouldn't be that hard to figure out from the outside. I couldn't figure out who they were, let alone find documentation. That's when I suggested using someone else. That's also when he said he doesn't like PayPal's "simple" form. And that's when he made me an account that I found very useful in making progress.

Even though their lack of public doc annoys me, I was about to link to them here. Their system has worked for years, their example is quite good, and their documentation is somewhere between plenty good enough and near perfect. When I went into the login system, though, I STILL had trouble finding their public site. Someone in fact has registered a similar domain that outranks the real one on Big Evil Goo; the fake one is a God-knows-what probable mischief site. The second result is very likely the real site, but they aren't even using an SSL cert. Given that even I finally started using an SSL cert some time ago, and then I finally did "the redirect" a few weeks ago, I'm going to discriminate. (I chickened out and did the 302 redirect so far. I suppose the 301 is in order very soon, given that the 302 shows every sign of working.)

So I'm going back in part to my original position. I would imagine they will continue to work fine long enough, but I have gotten the impression from the start that it's a dying product. They are supporting current customers, but they aren't looking for more under that brand. I have some loose indication that they are associated with Wells Fargo. Maybe they got bought out and re-branded.

Feb 22, 2022, starting 18:33

The madness has been on me for several days, although both my moon page and my eyes tell me the moon is days (waning) away from full, so presumably the madness is neither lunacy nor lycanthropy. The madness meaning that I am obsessed with various tasks of my pet, personal, non-paid projects.

So my web log loader loads efficiently, and in the last few days the verification is working more and more efficiently. I decided, though, that a 2 - 3 second delay due to ssh making the connection is not good enough. I also decided against a web app because there would be all sorts of permissions mucking between the www-data Apache user and the "adm" group that can get into /var/log/apache2. In the past I solved web access with a tiny C program that uses setuid() / the setuid bit. In that case I wanted C to call a very simple application with no parameters or complexity. This would involve empowering a relatively complex program, so I decided against web access. Also, given that I'm wanting to whittle down from a few seconds, the web path would cause its own delays.

So I have created a dedicated background PHP program / process that listens on a highly secret port number (that is sitting right there on my public repo). I minimize RAM with an input buffer and proc_open() into "$ openssl md4". I am using md4 because a combination of web searching and testing demonstrated it's fastest. In other words, the program sits there already running with the log file already open, waiting to md4 hash a section of the web log to compare against what's in my local database.

While I was at it, I refined my password hash interface to more easily create passwords and hashes. The socket listener authenticates the request with 0.02 seconds of password hashing. I want to spend minimal time on that, but I want it to be enough time to turn cracking the password into an age-of-the-universe level task.
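
For the curious, here is roughly how I would tune that "some time": bump the bcrypt cost until one hash takes about the target. This is a sketch of the idea, not my actual interface, and the test password is a placeholder.

<?php
// Sketch: find a bcrypt cost where one hash takes roughly the target time.
$target = 0.02; // seconds, per the figure above
for ($cost = 4; $cost <= 16; $cost++) {
    $t0 = microtime(true);
    password_hash('placeholder-password', PASSWORD_BCRYPT, ['cost' => $cost]);
    if (microtime(true) - $t0 >= $target) {
        echo "use cost $cost on this machine\n";
        break;
    }
}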

In short, it's pretty spiffy. I think it's done, but I haven't set it live yet. On second thought, I can probably live with 3 seconds. The app I created is a good model for future "listeners," so it was helpful and fun even if I never deploy it.

Several days ago I set the verification process to fork such that the database md4 and file md4 happen at the same time. I created a new forking class for this purpose that is more specific and thus simpler than the more general fork process in kwutils.

On my part-time, few-hours-a-week paid project, I've had a productive few days in terms of efficient coding. The timecard specification has plagued me. It seems that I now have complex logic whittled down to about 11 lines. It only took me 5 years of intermittent work to achieve that. The overall phase of that project is that I am partially rewriting the application to extract it from Drupal. A quick search says that I've talked about this to some degree, going back to last July.

Only extracting from Drupal, though, I have deemed not good enough. I spent a lot of time making sure my save-on-edit (keystroke) process was totally reliable. When I started dev'ing again, I immediately got 2 cases of false positives--a save indication when there was none. That makes a certain amount of sense given how drastically I have changed things, but I decided it wasn't good enough, so I have dug into a deeper rewrite, which of course led yet deeper. I should have vastly better code when I'm done.

Between what I call Brooks's Second Law and Raymond's Corollary, plan to keep rewriting it, because you'll keep rewriting whether you plan on it or not (Brooks's [First] Law: plan to throw one away; you will, anyhow). What I call the Second Law involves rewriting. What I call Raymond's Corollary is in his classic The Cathedral and the Bazaar.

Feb 19, starting 18:52

Going back to my previous entry, the sequence process I had would work fine assuming that the sequence existed before multiple processes started using it. Last night I put more code in GitHub where I trivially demonstrated the potential problem. If the sequence did not exist before a heavy multi-process load, there would be many failures.

I am usually referring to codename "recent CS grad" apprentice. Two nights ago "sales guy" apprentice read my previous entry. In answer to some of his questions, and to elaborate on this issue:

A real-world example is how to generate a sequential invoice number (1, 2, 3, ...) when there are multiple users who might hit "buy" at the same time. Some physical few bits of some computer somewhere must be dedicated to keeping track of which number the sequence is on. But it gets more complicated than that. The operation to check and then add to that number must be "atomic," or it must be a "transaction" in the database technical sense. An atomic operation is one that breaks or makes a mess if it does not complete all its steps. In this case, the operation is checking the existing number and then adding one. If one process checks and then another process checks and adds before the first one adds, then two customers will get the same number.

At some point, sitting on a toilet (under some conditions) becomes atomic. You have to finish the cleanup step, or you have a mess. In the data world, the data is either wrong or otherwise corrupted if an atomic operation fails and does not "roll back."

If you go looking, the classic example of an atomic operation / transaction is when a husband and wife with a joint account show up at two bank branches at the same time. If the timing of checking the account balance versus giving them their cash is precisely wrong, they overdraw their account. (Look it up.)

My code last night demonstrated the equivalent problem with a brand new sequence. I have a bit of code where if I only do the loop body once, the sequence can fail. If I loop twice, then one of the two iterations is guaranteed to work, and my code bears that out.

Some time perhaps I'll elaborate on this. The point for my geeky purposes is that I finally have a sequence process that will work under all reasonable conditions. As I said in my previous entry, my musing over this issue has led me to all sorts of fun and games. I might have saved myself some time if I had done last night's code roughly two years ago. Then again, I did some cool stuff along the way.
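
For reference, here is the shape of an atomic Mongo sequence in PHP, assuming the mongodb/mongodb library; the collection and field names are placeholders. Note the caveat from the top of this entry: if the counter document does not exist yet, concurrent upserts can race, so a retry (my "loop twice") is still needed for a brand-new sequence.

<?php
// One findOneAndUpdate = one atomic check-and-increment on the server.
require 'vendor/autoload.php';
$seqs = (new MongoDB\Client)->test->sequences;   // placeholder db / collection
$doc = $seqs->findOneAndUpdate(
    ['_id' => 'invoice'],
    ['$inc' => ['n' => 1]],
    ['upsert' => true,
     'returnDocument' => MongoDB\Operation\FindOneAndUpdate::RETURN_DOCUMENT_AFTER]
);
echo $doc['n'], "\n";  // 1, 2, 3, ... even with concurrent callers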

Getting back to the real-world problem, note that Amazon makes absolutely no attempt to give their customers sequential numbers starting (decades ago) with 1. A quick search shows that as of several weeks ago, Amazon does almost 19 orders per second. We are at 1.6+ billion seconds and counting into the UNIX Epoch (seconds since the start of 1970 UTC). That's a 10 digit number. Even if Amazon had done 19 orders a second going back decades before it existed, that's an 11 digit number. As of March, 2021, I have an Amazon receipt with a 17 digit order number.

The format is 3 numerical digits, then 7, then another 7. The first one might be a server number. It would make sense that the individual server would be preventing duplicates.

Feb 18, starting 01:21

I got my PayPal donation page working. For most of my work "day," PayPal's systems were working fine. Based on their system status, it seems they were having problems late on the 16th, which caused me problems. Later in my day, the sandbox system died again, so I stopped messing with it. It's working fine.

Today my bigger irritant with PayPal is that it seems very difficult to correlate the "Transaction ID" with anything you can access "programmatically." The transaction ID is what both the payer and payee see, but it's hard to get at. I thought the difficulty was only because I'm using the simple client-side JavaScript button widget. Upon research, though, it's not necessarily easy to correlate with the full API.

As I investigated the API, it was hard to tell again whether PayPal was having problems or I wasn't getting my queries right. There is a transaction history function / API call, but I was either having difficulty formatting the time right, or there is a long delay until the data is available, or the sandbox was having problems.

I did some experiments with an "invoice ID." Both the payer and payee see that, too. I got that working (in the test version) in that I could create an invoice ID, but then there is the question of uniqueness. That question has been plaguing me in various forms for something like 2 years now.

So, my latest sermon on unique IDs... Most of the IDs I've been exploring lately are absurdly too long for an invoice ID; they are meant to be unique in all of time and space. I want the ID to be intelligible to my customers. So that brings me back to sequences (1, 2, 3, ...).

It was in part my uncertainty over getting sequences out of MongoDB that set me on the path towards my PHP extension to get a unique ID from time, the CPU clock's tick since boot, and the CPU (core / hyperthread) number.

So, after all this time, I decided to revisit MongoDB sequences. My fork class is easier and easier to use, so that means I can fairly easily set any task to use all my CPUs / cores / hyperthreads. So the setup is to set all the CPUs "demanding" a sequence at once, in a very tight loop. In my "code fragments" repo, I have posted the code.

MongoDB's instructions (indirectly) on sequences are somewhat vague, but now I can say with a reasonable degree of certainty that they work. MongoDB went to 500% CPU usage (6 of my 12 hyperthreads: 6 cores X 2 for hyperthreading). My 12 PHP threads divided the rest of the CPU usage. That was MongoDB doing a lot of work resolving locking issues. I demonstrated again that it takes a long time to resolve locking issues. That is, a sequence is not an option if the system is going to be banged on. However, if I asked for 950,000 sequence calls, the sequence at the end was precisely 950,000. (I started the sequence with 0 internally; the first call would get a 1.)

When I just "asked" Mongo for the sequence, that took much longer than actually writing rows with Mongo's default _id or my nanopk(). I will try to remember that I can "objectify" my array and use that directly as an _id. I'm almost certain that arrays aren't allowed. I suppose in future editions of nanopk() I should see how hard it is to return an object.

Feb 16, 2022 (starting early 17th)

I was battling PayPal's simple, few-lines-of-JavaScript button for about 5 hours. I first wrote 8 hours because it feels like that and more. Then I went back to look at a discussion I was having, so 5 hours is correct.

I'll come back to my shrieking about PayPal. I got BitCoin and Ethereum working, I hope.

As for PayPal, I was having problems in sandbox mode. I hope that sandbox mode is far more problematic than live. The problem I was having was that the popup window kept crashing / closing. There was a 500 error, but that was rather useless for diagnosis. Even when it worked, various steps could take something like 40 seconds. The solutions, in approximate order of importance, were something like:

Here is my release candidate code. Hopefully I'll go live in around 16 hours.

Feb 13, 2022 (starting 18:58, PS started 19:55) - Zoom

I see an ad for someone who wants to learn a given dev language. It's not one that I've done, but I've done 8 professionally and two others without pay, so I can probably help. I tell him just that, and I ask for some source code, so that I can get a feel of the situation. He mentioned he had a bug, so I figured the place to start might be to solve the bug. Given that he's just learning, I can't imagine that the code is sensitive.

So I ask for source code, and I get back precisely, "Can we do a zoom [sic] session to discuss?" This is an example of why I shouldn't be gig hunting without help. Here is the ultra sarcastic version of my response. I have no intention of sending it to the unlikely to-be client.

I understand that the ad was for teaching, but I need to deal with the objective part first. I'd like to see some relevant snippets of the language first. Software dev is not improv; it's not live stage acting. Software dev in itself is not real time like driving. I like time to think about what I'm doing. There is simply not much to say to this person until I see some code and decide if I want to go down the path.

Furthermore, what the hell is it with Zoom? I suppose I'm going to go off the "professional" reservation, but the whole world is off the reservation. I'd never heard of Zoom until Billuminati Gates' Satanic Crusade. I can't get excited about the crime of war profiteering because arguably no wars fought by the US after 1815 were legitimate. I was trying to find historical instances of executions for such, but the closest I found offhand was 17 years in prison for David H. Brooks of DHB Industries, who died in prison in Danbury, CT in 2016. He sold ineffective vests that were not bulletproof. (Oh, of course, the $10M party was a bat mitzvah party. See my personal blog for "the preface" on that.)

Twitter and YouTube and company are so fond of their "misinformation" disclaimers. If Zoom wanted to make the following disclaimer, that would be one thing:

After due process of law, we deem it near certain that Dr. Fauci would be convicted of multiple capital offenses and possibly executed. Until that time, however, we will try to help keep life going with our service.

If they did that, I would probably be satisfied. Off hand, I see no such statement on their home page. Surprisingly, I don't see masks, so that's something, but not enough. The point being that I see Zoom as "Covid" profiteers, and I do get excited about that.

Also, Zoom goes in a similar category to SatanSoft and Satan's Operating System even before SatanSoft was obviously funding a war against humanity. Why on earth do people install proprietary software when there are free and open source alternatives? Even if you were using Zoom 3 years ago, why would you do that?

I asked you for source code. Until I have source code and have studied it and debugged it, no, I don't want to go live to discuss, and I certainly don't want to go live on Zoom.

possible solutions

To my apprentice, no, lurking in the background won't do. As I said, I am not going live until I get what I asked for. One potential solution is that you contact him and tell him we're connected. You can call me autistic or Asperger's or anti-social or shy or even a crazy conspiracy theorist. You can call me quirky and touchy and eccentric or even an asshole. You can say that I am incapable of politely asking for source code AGAIN, so you are politely asking for source code.

Another option is you do what you did with Flutter. At least this time we have a response. You don't have to mention his response to me; the point is this guy is responding. See what you make of the language. Possibly install a debugger. Then see if he'll talk to you. You can make an argument similar to mine in that you've been at it in a number of languages for a very long time. I'm not necessarily encouraging you to take that risk, although as I said in the email to him, R looks like it's worth learning. I've responded to perhaps 5 R ads in the last year or two, but I think this is the first response I've gotten.

PS - 19:55

I mentioned a checklist. In some contexts, that would be useful. In this case, I've already made my checklist. I want some blanky blank source code. That's the one item on my list.

Feb 5, 2022 (starting 22:57; posting 23:45)

That may be some of the more satisfying dev I've ever done. The code is in GitHub. For reasons that would depress this entry, I corrupted my local database of my web server access logs, going back to late October. When I reloaded the file, it had become 400MB and 1.7 million lines. My loading program ate 2 - 4 GB RAM, was swapping, was depressing to watch in "top," and took 5 - 10 minutes. Now it takes 6 seconds and consumes minimal memory. I am not quite comparing apples to apples, but I'll get there.

I'm not quite comparing apples to apples because the old version did the line parsing--separated the parts of a line and gave them digital meaning where appropriate. The new version only chops up the file into lines and puts them in a database with enough line number information to reconstruct the line numbers later.

The old version loaded the whole file into memory, so that was one problem. The new version sets as many cores / hyperthreads as one has digging into the file at the same time. The processes are reading the file line by line, so I am not loading multiple lines into memory. More precisely, I may load parts of two lines into memory, but that's it.

Another large improvement was to abstract "insert one" into "insert many." I learned that trick over 20 years ago, and it still works. If you insert a few hundred rows in an array with one command, that is tremendously faster than inserting hundreds of rows with individual insert commands. I created an "inonebuf"--insert one with buffering--such that the user can queue individual rows, and the buffering and inserting is done for them.
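
The idea is roughly the following sketch (again assuming the mongodb/mongodb library); this is my illustration of the concept rather than the actual inonebuf code:

<?php
require_once __DIR__ . '/vendor/autoload.php';

class InsertBuffer {
    private $col;
    private $max;
    private $rows = [];
    public function __construct(MongoDB\Collection $col, int $max = 500) {
        $this->col = $col;
        $this->max = $max;
    }
    // the caller queues one row at a time and never thinks about batching
    public function queue(array $row): void {
        $this->rows[] = $row;
        if (count($this->rows) >= $this->max) { $this->flush(); }
    }
    public function flush(): void {
        if ($this->rows) {
            $this->col->insertMany($this->rows); // one round trip for up to $max rows
            $this->rows = [];
        }
    }
    public function __destruct() { $this->flush(); } // don't lose the partial last batch
}

Whether to rely on the destructor or call flush() explicitly at the end of a run is a judgment call; being explicit is safer.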

I created classes to simplify forking (multi-process / multi-core) perhaps 2 years ago. Now I've put those in kwutils (in the general sense, not the file). They work splendidly as written years ago.
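
For anyone who hasn't seen process forking in PHP, the bare mechanism underneath my classes looks something like this (CLI only, and it needs the pcntl extension); doChunk() is a hypothetical worker function, not anything in kwutils:

<?php
$nWorkers = 12; // e.g. one per hyperthread
$pids = [];
for ($i = 0; $i < $nWorkers; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) { exit("fork failed\n"); }
    if ($pid === 0) {            // child process
        doChunk($i, $nWorkers);  // hypothetical: process every Nth line of the file, for example
        exit(0);
    }
    $pids[] = $pid;              // parent remembers the child
}
foreach ($pids as $pid) { pcntl_waitpid($pid, $status); }

function doChunk(int $worker, int $of): void { /* the real work goes here */ }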

Feb 3, 2022 (starting 03:32)

The moon phases are now in the user's / browser's local time.

Feb 3, 2022 (starting 01:01)

In answer to my disappearance over the last several hours, the lunacy took me again. Another apprentice came online who is in another timezone. So, my fun tonight started out considering adding a timezone offset from JavaScript (the user's browser) so that he would get times in his local timezone. This led to separating calculation from data acquisition (from the Python ephemeris) and storage. That led to cleaning up the various "magic numbers" where I'm trying to make sure to always have enough days of data. I also took my advice to use AJAX: I created a data buffer so that any given user is unlikely to suffer a delay when the almanac (ephemeris) loads. The almanac takes about 0.5s to load, so after every call to the app, there is an async call that makes sure there are plenty of days in the database. The user won't see any of that unless they are using dev tools, although they might see a spin somewhat away from my HTML doc. Do browsers show any spin (spinners) under those conditions? I don't know. I'll see what it looks like over time.

The alternative is to set a cron job. I might do that.

Anyhow, when I had everything working locally, I broke the live system for roughly 15 minutes. A violation of Rule #2 burned me. Locally I'm running MongoDB 4.4. (I can't run 5.0 because my CPU is too old.) Kwynn.com is running v3.6.

The violation of Rule #2 led to my having to seriously bend Rule #1. One can run a debugger remotely, but it strikes me as a bad idea. So I had to go do what is generally a gross violation of Rule #1. My current code in GitHub still has a file_put_contents() on line 29. The error was on the current calc.php (not data) line 41. $ala[0] did not exist. I did the file_put to see the data. The algorithm assumes that the data is ordered from earliest to latest timestamp. It would appear that locally MongoDB was returning documents in the order the data was put in the database, which is what I wanted. I could not at a glance figure out what the earlier MongoDB version was doing, but it wasn't in the order I needed. So the solution was to add an explicit sort, which is what I should have done in the first place.
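
The fix itself is essentially one line. A minimal sketch, with the same library assumptions as above and made-up collection and field names:

<?php
require_once __DIR__ . '/vendor/autoload.php';
$col = (new MongoDB\Client('mongodb://127.0.0.1'))->almanac->phases;

// never rely on natural (insertion) order; ask for the order the algorithm needs
$cursor = $col->find([], ['sort' => ['ts' => 1]]); // ascending timestamp
foreach ($cursor as $doc) { /* earliest to latest on any MongoDB version */ }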

The code is somewhat cleaner now, but I'm not sure that whole exercise was worth it.

February 2, 2022

I have posted my working lunar calendar.

You said, "I desperately need to start adding repos to GitHub in case these clients want to actually look at it." I'm not sure anyone in that context has looked at my GitHub, despite my mentioning it a number of times. It seems the people I most wanted to look didn't respond at all. I doubt that's because they looked at my GitHub and then decided not to respond, although I can't be sure.

I have learned over decades to be very careful what I do for that sort of reason. If you are working on something that does not absolutely have to be secret, then post it publicly to GitHub. Why not? *Maybe* someone will look at it one day. It serves as a backup and version control if nothing else. The effort to go backwards in time to post stuff should be nowhere near "desperate," though.

I have found GitHub to be very motivating, and it continues to be motivating even though it doesn't seem to have served the purpose you mentioned. When I have posted to GitHub, I have "published" in the academic sense, even if there is no peer review and even if no one ever looks at it. It's on the record. I also like the idea of SatanSoft paying for it. If they want to host my code for free, I'll take them up on it.

January 31, 2022 - Earth counters Luna (02:07)

Going back to yesterday's moon entry, I have begun my counterattack against the moon. There is nothing live yet because it's all back end so far. I have the Python ephemeris working. I feed Python output (painfully) into PHP and load it into the database for better sorting and querying. I have all the data I thought I had earlier. This time I've tested it down to the minute, several weeks into the future. I have a notes file with some installation notes.

In Python, SkyField deals with something called NumPy (?) that seems to have a problem serializing a simple array. Specifically, I can't simply json.dump() it. So I preg_match_all() in PHP, which is an obnoxious way to do business. It seemed faster than decoding NumPy, though.
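
The obnoxious part looks roughly like this; the sample string is made up, but it's the general shape of pulling numbers back out of whatever text Python prints:

<?php
$raw = "[2459600.123 2459614.789 2459629.456]"; // hypothetical Python / NumPy output
preg_match_all('/[0-9]+\.[0-9]+/', $raw, $m);
$times = array_map('floatval', $m[0]); // e.g. Julian dates as PHP floats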

There is a noticeable calculation time with the almanac SkyField function; less than a second, but noticeable. That's not a complaint; I'm starting to get some notion of how complicated that calculation is. That's one reason I'm saving the result in the database.

I'll eventually write code to tell JavaScript when Python should be called again for data days enough in the future. I'm starting to think the easiest way to do asynchronous calls is not surprisingly with Asynchronous JavaScript and XML (AJAX). The alternative is to try to exec() in the background, but weird things generally happen when I do that--either directly or the debugger won't work. One day maybe I'll figure that issue out in PHP. Anyhow, when the data goes "down" to JS, the JS is "told" if it should do an AJAX query back to PHP to update the database. That's another weird way to do business, but, again, it's called async for a reason.

January 30, 2022 - when to collect from a client

This is the 3rd topic today. Should I have a separate blog for the business of technology? Mostly what I'd have to say is what not to do.

Financially and perhaps more importantly psychologically, for a project of any substance, I need some payments before it's done. The project needs to be divided into benchmarks where some aspect of it works--a proof of competence. Also, we can use an escrow service. Escrow.com charges a lot less than what I saw years ago. They have an "A" rating with BBB, last I checked. An escrow service would help, but I would still need interim payments.

Put another way, bill early and often in small increments.

I can go on about this at some length--both historically and for the future. I'll wait for your reply. How big a problem is this for you?

January 30, 2022, one of several entries today - new topic

I had never thought to check, but the "mysql" command is a symbolic link to "mariadb." Yes, you are correct: I should start referring to the mariadb command line as such. I'm sure I'm going to slip, though, and refer to "MySQL." It should be understood that I always mean MariaDB.

Did you set the MarDB password with sudo mariadb? You must have. The logic may have changed over the years, or it might not have. Apparently once you set a MarDB root user password, you have to use that password even if you're Linux root. Maybe you've come across something I haven't, but that seems like it could lead to trouble. Is there an override as Linux root? You might consider unsetting the MarDB root password and using specific MarDB users for each database or several databases. I have done it both ways over time. When I set a MarDB root password, I save it in a file only accessible by Linux root in the Linux root user's home directory. I do not have a MarDB root password set on my dev machine. I created a user for "main project."

January 30, 2022 - return to the moon

part 1 - when I still thought it worked or was about to work

That would be returning to the moon in an anti-gravity ship, not that rocket silliness. To quote the Cthaeh, "Such silliness."

The moon phase UNICODE characters didn't show up that night because I hadn't gotten that far yet. It was almost certainly not a mobile issue.

In the following, I'm making fun of myself much more than you. You asked about "hundredth millionths of a second" precision. Given that I have spent an enormous amount of time on time (sic), I will seriously address your question. I realize you probably meant the point hyperbolically.

Yes, the count from 0 to 1 is in real time. No, it's not displaying "hundredth millionths of a second" precision. It's not that precise for several reasons. For one, pop quiz: what is the refresh interval? The numbers are not blurring to the eye. I wrote the refresh interval specifically not to blur. I want the user to immediately see that the calculation is in motion, but I don't want it to be blurring. The refresh interval is 7 orders of decimal magnitude from what you said.

To continue the pop quiz, what is the unit of the numerical argument to the function in question--the "refresh" function? That's 5 orders of magnitude. If you send an input below 1 unit, will it make any difference, or will the unit be interpreted as an integer? That's a very good question. I don't know the answer.

Also, 1 represents a lunar month. I am displaying 8 digits to the right. If I have the math right, that's displaying down to increments of 25ms out of the lunar month (7 orders of magnitude), but the refresh interval is several times higher than that.
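
A back-of-the-envelope check of that 25ms figure, using the mean synodic month:

<?php
$lunarMonthSec = 29.530589 * 86400;   // ~2,551,443 seconds
echo $lunarMonthSec * 1e-8 . "\n";    // 8 decimal digits of a month ~ 0.0255 s, i.e. ~25ms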

I'm pretty sure the answer is the precision is equal to the refresh interval, but I may have lost track of the question.

Then there is the question of whether I could keep track of hundreds of millionths if I wanted to. In a compiled language, I am fairly sure the raw numbers in the CPU would keep up with real time to that precision, but by the time it was displayed in any manner, it would be several orders of magnitude off. If it were a fancy display using a GPU, then it would come down to that max refresh interval. I understand those are getting faster and faster, but when do they stop and conclude the human eye can't perceive anymore? Or are they going beyond that as supplemental CPUs? I guess the latter, but I'd have to dig.

part 2 - such silliness indeed

Hopefully I am not under the influence of the Cthaeh. On one hand, he would do much worse. On the other hand, it's hard to say what the effects will be.

After mucking about way too much, I did a check against a reliable source and realized that the "lunation" of the moon is rather complicated. I had assumed it was close enough to linear only in the sense that earth days and years are constant down to an absurd number of decimal places. (There are leap seconds every few years.) I thought the lunation period was the same. Silly me. Such silliness. "...the actual time between lunations may vary from about 29.18 to about 29.93 days" (WikiP). Any time I cite the WikiP I must warn that quite a few of its entries in absolute numbers, if not relative, will get you killed if you believe them. I most certainly checked against another source.

As for your second email on the subject, hours ago: like many applications, there is the external goal that you are trying to calculate or track or assist with, and then there is the implementation. In answer to one of your points, the "technical part" in terms of the orbital mechanics went over my head, too. Perhaps worse yet, I made a silly assumption and didn't even think about or look into the orbital mechanics. With that said, there are probably some decent tricks in my code such that once I have valid input, the processing of that input had some merit.

When and if I ever have valid input, Rob Dawson's "Moon/Planet Phases in JavaScript" code looks very interesting. It would be more realistic than my using a handful of UNICODE characters and opacity CSS.

As for getting valid input, I will hold my nose and grudgingly admit that this is one of a handful of situations where the best code is in, ..., rrrrrrr..., I don't want to type it...... Python. Not too deep in the bowels of my "code-fragments" repo, there is a Python script that uses a powerful ephemeris library.

I would likely write everything I can in JavaScript and PHP and then call Python with the exact data in the exact format that the library needs.

January 29, 2022 - howl-dev-ing at the moon

It became critical to know when I turn into a werewolf. Although, come to think of it, I didn't add that feature. Hmmm... We need a definition. One becomes a werewolf when the moon closest to full rises at one's location? That will take more work involving location--fairly easy--and location-specific moon calculations, probably not as easy, perhaps not so hard, either.

The whole disk of the moon rises, I guess? Seems that a sliver isn't enough.

It should be emphasized that the Quileutes are not werewolves. Edward pointed out to Caius that it was the middle of the day. This fact surprised Jacob. (I don't remember if that fact made the movie.) Me, read the book? Nah.

Anyhow, I worked on it for about 2 hours after you signed off. The version at that point has 7 different UNICODE moon symbols with the sky going dark at new moon and brighter towards full moon. It's only 3 columns now--date, UNICODE character on a background of varying blackness, and the 2-word description of the phase--waning crescent, waning [and close to] new, waxing crescent, etc.

Oh yes, here is the live version, and here is the snapshot of that code. Just after linking to that version, I moved the style tags back into the HTML rather than an external CSS. I'll get around to deleting the CSS eventually. Then I made a few more small fixes.

This is one main way to build HTML elements--by creating them as objects in JavaScript and appending them to the preexisting document. Note that cree() is defined in /opt/kwynn/js/utils.js. I mentioned this weeks ago in the context of various ways to handle MVC and 4 layers of code. The other way is to write the HTML directly in PHP, keeping HEREDOC string format in mind. They each have their cases and merits.

In this case, everything is (client-side) JavaScript. If you were building HTML in JS starting from a database, one method goes something like the following (all source code):

<script> <?php echo("\t" . 'var KW_G_CHM_INIT = ' . json_encode($KW_G_TIMEA) . ';' . "\n" ); ?> </script>

A possibly better example would be when you init a larger array that the JavaScript cycles through and creates a row for each member of the array. You still init it the same way as above.

January 28, 2022 (first entry 17:14, entry 2: 17:52)

Entry 2: Note to self regarding my goaround with OAUTH2 recently: a rewrite rule would partially solve the XDEBUG flag from Google when it sends an OAUTH code. That code expires in a matter of tens of ms or less, though, so if the debugger stops processing before a certain point, the code won't work. At least, in my reading I learned that a system clock being off by 5 ms could invalidate the code. So maybe there IS a practical reason to keep good time; imagine that.

Yes, setting up an environment--databases, libraries, debugger, etc.--is painful the first few times you do it, and even then it can be painful. Helping with that is one of my jobs, so it's best not to do such things while I'm asleep.

Otherwise put, expect such things to be a pain for the first several times you do them. Take notes; keep those notes in a very accessible place--perhaps publicly online or in GitHub. If you don't do it for months and months, then it can still be a pain when you forget.

When you ran sudo systemctl status mariadb , the results were somewhat puzzling. I had assumed that installing the mariadb server will install the client. Now I'm not so sure. So, for the record:

apt list --installed | grep mariadb
mariadb-client-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-client-core-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-common/impish-updates,impish-updates,impish-security,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 all [installed,automatic]
mariadb-server-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-server-core-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-server/impish-updates,impish-updates,impish-security,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 all [installed]    

When you see version numbers in the package names, you generally do not want to install those specific packages. You almost always want the unversioned package, which gives you the latest available. The situation gets even more puzzling:

sudo apt install mariadb-server
# already latest
sudo apt install mariadb-client
# something new was installed.  Huh?

Whatever the case, does the "mysql" command exist? I realize you are going to use MySQL Workbench as a client, but you'll want that mysql command. "sudo mysql" will get you in as the root mysql user so that you can set the root password for non-root Linux users. That is, mysql (mariadb) assumes a relationship between the Linux user and the db user unless you specify otherwise. You can get into mysql as mysql-user-root with sudo, but MySQL Workbench will need a MySQL root user password because you should not be running MySQL Workbench as root; it's just bad form.

I am puzzled why your mariadb status talked about the "mysql" (client) program at all. Mine does not. It was somewhat unconnected that I suggested what was wrong with your Workbench connection.

For future reference, I'm not sure how clear the distinction is in MySQL Workbench between "server not running" and "server is running but I, Workbench, cannot connect to it." In some ideal world one of us would take the time to document that. That distinction led to some confusion over the last few days.

As I think about it, your status results were even more puzzling. This is what I get:

systemctl status mariadb
● mariadb.service - MariaDB 10.5.13 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-01-28 02:50:44 EST; 14h ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
   Main PID: 1179 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 9 (limit: 9458)
     Memory: 217.9M
        CPU: 1.779s
     CGroup: /system.slice/mariadb.service
             └─1179 /usr/sbin/mariadbd

Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] InnoDB: 10.5.13 started; log sequence number 862675097; transaction id 15437
Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] Plugin 'FEEDBACK' is disabled.
Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] Server socket created on IP: '127.0.0.1'.
Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] Reading of all Master_info entries succeeded
Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] Added new Master_info '' to hash table
Jan 28 02:50:44 ubu2110 mariadbd[1179]: 2022-01-28  2:50:44 0 [Note] /usr/sbin/mariadbd: ready for connections.
Jan 28 02:50:44 ubu2110 mariadbd[1179]: Version: '10.5.13-MariaDB-0ubuntu0.21.10.1'  socket: '/run/mysqld/mysqld.sock'  port: 3306  Ubuntu 21.10
Jan 28 02:50:44 ubu2110 systemd[1]: Started MariaDB 10.5.13 database server.
Jan 28 02:50:45 ubu2110 mariadbd[1179]: 2022-01-28  2:50:45 0 [Note] InnoDB: Buffer pool(s) load completed at 220128  2:50:45

In my case, it tells you the port is 3306. Or:

sudo netstat -tulpn | grep -i mariadb
tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      1179/mariadbd  

Your router is not at issue. The router has no part in the "network" traffic inside your computer. One diagnostic would have been to run "mysql" and / or "sudo mysql".

January 27, 2022

As with a week or two ago, cue "*autistic screeching*". If I haven't said it before, I hate OAUTH2. So, for the record, if you add a redirect URL in Google Cloud Console, you have to add the URL to your client_secret file IN THE SAME ORDER!!!! Otherwise, you get '{"error":"redirect_uri_mismatch","error_description":"Bad Request"}'. Here is my debugging code.

To make matters worse, I am using debugging code in part because I could not think of a good way to conform to my rule #1. That is, given that the redirect URL is called from Google, how do you get Google to put a precise URL query in the URI such that it invokes the debugger? I tried putting the xdebug query in there, and I'm fairly sure it simply told me it was an invalid URL. I found instructions for getting Google to pass data in a "state," but not a literal, precise URL query. The debugging code shows where I use file_put_contents with FILE_APPEND. It led me close to the problem, and then it occurred to me that order might matter. At least, I'm 87% sure that's the problem. I am not going to reverse it right now just to check.
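
The crude logging in question is nothing more exotic than the following sketch; the log path is made up:

<?php
// dump whatever Google sends back to the redirect URL, appending rather than overwriting
function dbg($data): void {
    file_put_contents('/tmp/oauth_debug.log',
        date('c') . ' ' . var_export($data, true) . "\n", FILE_APPEND);
}
dbg($_GET);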

This is almost reason in itself to get away from Big Evil Goo and run my own mailserver on my own domain.

January 26, 2022 - ongoing entries, 17:15, then 19:26

debugging MariaDB connect issues

If MySQL Workbench can't find the service, the service probably isn't running. Usually a service starts when it's installed, but not always.

# no harm in doing a restart if it's already running
sudo systemctl restart mariadb
sudo systemctl status mariadb
# ...      Status: "Taking your SQL requests now..."
# That's funny.  I've never noticed that before.
ps -Af | grep -i mariadb
# 7278 is the process ID.  Yours will be different in all probability
# mysql       7278       1  0 19:27 ?        00:00:00 /usr/sbin/mariadbd
sudo netstat -tulpn | grep -i mariadb
# 3306 is the port I would expect
# tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      7278/mariadbd       

snap versus apt 2; Firefox 2

A double installation was not the problem I had. It was something much more subtle. If I may be so bold, your whole email should be entered into the record. You're anonymous for now, so I'll just do it.

I have spent well over half an hour doing all sorts of such things.

funny, the only issue i had with snap vs apt was exactly that, firefox. i just had to apt remove firefox, because somehow i had 2 copies of firefox floating around... randomly, (well probably not random at all) in certain apps when I would click a link (like discord or thunderbird) it would open in the older firefox (apt) version, and the rest in my new firefox (snap install). the only reason I could differentiate the 2, and know which one to uninstall, was when I went to help/about in firefox, it actually said it was a snap package.. so I knew to remove the apt version.

previously to that, i wasted about a half hour trying to find out how to set the default browser to load in thunderbird when a link is clicked, and sadly i didn't find any way to do it.

snap versus apt

The trend appears to be towards snap, so if there is a snap, go with it. I was playing with something once that was causing trouble as a snap versus an apt, but it wasn't that important an issue. I think it was Firefox, but I'm not even sure.

MySQL, MariaDB, and MongoDB continued

In response to your email a few hours ago... You may have misunderstood the choice I was positing. If you're using Ubuntu, MariaDB is the way to go. There are supported packages for MariaDB. MySQL was abandoned by much or most or almost all of the open source community over concerns that Oracle would corrupt it. (Oh, Oracle is a funny coincidence, huh?)

As best I remember, every site I've seen over years has moved to MariaDB, except perhaps for the insane people who were still using PHP 5.x.

Yes, I do recommend installing MySQL Workbench directly from Oracle. I don't think there is an Ubuntu package for it anymore. MySQL Workbench gives you an enormously prettier SQL command line to work with, as opposed to the pure "mysql" command line. Even though I'm using MariaDB, the commands have stayed the same, so it's still mysql. MySQL Workbench will "express" "confusion" over connecting to MariaDB, and it will warn that it's not supported, but it will work.

So, going back, the choice I was positing was not between MariaDB and MySQL. The choice was between MariaDB and MongoDB. I have probably a few hundred lines of working MongoDB code in my repos. I have 0 MySQL code on GitHub. I drank the MongoDB (object oriented database) Kool-Aid in 2017, and I've started using it whenever I have the choice. I won't be using MySQL for my main (part-time) project for much longer. As soon as I free it from Drupal, I'm done with MySQL.

MySQL is not going anywhere, though, and I'm in the minority of PHP and MongoDB users. MongoDB is much more closely associated with Node.js (MEAN / MERN stack). MySQL will have much better examples out there, too.

I said that there would probably be some grumbling on my part if you chose relational (MySQL) over OO. We'll see when and if the grumbling begins.

January 24 - 26, 2022 (20:06 - entry 2 1/24; first entry of the day drafting at 02:14 1/24, posted 1/26)

entry 2 - 20:06

Going back to the previous entry, I can now delete the .htaccess I created for this directory, because now the whole site is forced to httpS. For the record, it looked like this:

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,NE]           

This evening's main topic, however, is my moving towards creating a new instance for kwynn.com with the Nu HTML validator built in. I'm going to have to add Java, and that's probably enough of a disk space issue that I should up the disk space while I'm at it. I've been bordering on needing to add disk space anyhow. I have a 12 GB disk which can get around 83% full when updates are pending. My main client has a 16 GB disk, and that only ran out once when I was asleep at that particular switch for months and months. So I'll go to 16 GB.

Hmmm... AWS now has an Atlanta data center for EC2. On one hand, that's tempting. On the other hand, seems I should keep my system away from me such that any incident of the century doesn't affect both. On the gripping hand, the DC area seems somewhat more vulnerable than it has in decades.

The goal of this next instance is simply exploratory, so I'll stick with Virginia for now. It looks like t3a.nano is still my best bet. The t4 systems violate my rule #2 way below--the live system should be as close as possible to this computer I'm typing on. The t4 systems are ARM rather than x86_64.

Doesn't look like Atlanta is an option. Their instance type availability is very limited for my purposes. Boston is also limited; Boston is relatively new, also. Anyhow, a quick scan says still in Virginia with a t3a.nano at $0.0047 per hour.

To back up a bit, it's not just a matter of space. I'm concerned about just installing Java in my live environment. Java is reason enough to test the environment.

How much more expensive is gp3 ssd? Kwynn.com is currently running the security group labeled launch-wizard-3. Boy does AWS create a bunch of those by default. I will create a new key pair / PEM while I'm at it.

Upon entry into new instance, turn off cron jobs. It appears Nu needs a JDK in addition to JRE because it compiles stuff (javac). And then 512 MB of memory is not enough to compile. Time to invoke that .large EC2 type from months ago. Enable proxy and proxy_http. Open port 9999 in the security config.

My conclusion for now is that Java and / or Nu are too much of a RAM pig to put on a t3a.nano instance with 512 MB RAM. When testing, everything worked, but at one point Nu hung for something like 30 seconds. Right now on my local machine Nu / Java takes 300 MB.

entry 1 - 02:14

For kwynn.com, I finally did the server / virtual-host-level rewrite rule to redirect all http to https. Here are the running Apache config files. I may have aged a little bit just before I restarted Apache to see what would happen.

January 22 (first entry after 17:03, continuing 18:35)

Apprentice is trying again to run my chat code, as mentioned yesterday. First, I have a chuckle that "php fatal error" must seem ominous to him. On some days I probably get several fatal errors an hour while dev'ing, usually syntax errors. I'd imagine I've mentioned it before, but I remember in the C language in 1992 syntax errors could take me 3 - 4 hours to solve. Helping with such things is what I see as one of my major jobs as master dev for an apprentice. I don't think I learned all that much taking 3 - 4 hours to solve syntax errors. I probably just got that much closer to getting frustrated and changing majors.

In fact, I know someone who did that in 1985. At that point, the compiler simply said "syntax error." No line number. He'd had enough at that point. I at least had a better indication in 1992.

It's rare that syntax errors take me more than a handful of seconds to solve these days. Some of the most puzzling instances that have taken longer involved my violating my own rule #2, to the effect that your dev environment should be as close as possible to the live system. In those cases, the "Heredoc" PHP string syntax worked differently in PHP 7.4 versus 7.2. Thus, something worked on my system and gave a mysterious syntax error on the live system. Worse yet, the include / require_once() tree was such that a syntax error crashed the whole system. I only have one user of that system, and it was late enough at night that I doubt he had any intention of using it, but still.

The difference in "heredoc" was something about whitespace--the higher version of PHP allowed whitespace in places that 7.2 did not.
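
Specifically, the "flexible" heredoc syntax arrived in PHP 7.3, so something like the following parses on 7.3 and 7.4 but is a syntax error on 7.2, where the closing marker has to start in column 1 with nothing else on the line:

<?php
function greet(): string {
    return <<<HTML
        <p>hello</p>
        HTML;   // indented closing marker: fine on PHP 7.3+, parse error on 7.2
}
echo greet();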

Anyhow, I have already emailed him back. I am not tormenting him waiting for this entry. I still don't know exactly what is wrong because I'd need to be looking at his system. However, I can give some (more) general advice and repeat what I said in email.

When I say "include," I should clarify even more than I did in email. Generally speaking in a number of dev languages, an "include" means to incorporate code. The precise keyword "include" goes back to C if not before. To make the history somewhat more complicated, includes work differently in C because C is compiled. I'll stick with PHP from here on out. If I say "include" in PHP, I am using the word in the general sense to mean incorporating code. In terms of precise syntax, I almost always mean require_once(). I would state it as a general rule that one should use "require_once()" unless there is a really good reason not to. I would guess that I use require_once in 99.7% of cases. I can think of one specific case of using include and one case that I can't quite remember the details of.

Anyhow, on to his specific problem and the general debugging methodology. Once again, in this case I'm using "debugging" more generally because a debugger may or may not help him in this case. It may not help because his situation is akin to proving a negative. He's getting an error very close to "Class 'dao_generic_3' not found in line 54."

His first problem is very likely my fault to a degree. dao_generic_3 is my creation in this file. I probably should have called it kwynn_dao_generic_3 or something to make it clear that it's my creation.

So let's back up to the more general case. If a function or class isn't found, the first question is how far from the PHP core / default it is. If the function is in the PHP core / default install, then you should never get a not found. There are a few classes of functions, though, that are not installed by default. So, step 1 in debugging is to search the function / class on php.net.

To take an example that I encountered all too recently, let's try "mysql_escape_string." The PHP doc will inform you that the function and "the entire original MySQL extension was removed in PHP 7.0.0." What part of that is unclear? This is one of several situations that wasted enormous amounts of my time. The PHP release history page informs us that PHP 7.0.0 was released on December 3, 2015. Let's just say that I came across the mysql... function fairly recently. When I spluttered incoherently about encountering this problem for the first time in YEARS, I was told "no other developers have complained about this." Yes, those would be the developers who are happily charging you $150 / hour (a real figure) to maintain an ancient environment.

Similar indications of madness were that one of the systems--perhaps the live system--had a clock that was several hours off in part because the system hadn't been rebooted in 900 days or something. A few hours later I noticed that someone had used the "date" command and reset the clock.

I don't expect other people to be insane as I am about timekeeping, but *hours* off!?!?! Having to use the "date" command?!?!?!

Anyhow, I digress. The point being that if you ever have the misfortune to see a "mysql..." function, the PHP documentation will at least tell you what's wrong. Solving your problem is a rather bigger issue, especially when the people involved are likely insane. This brings up an interesting question. Can developers diagnose insanity based on years-old functions or hours-off clocks? Perhaps so.

Now I'll give a much more reasonable example. In kwutils.php is getDOMO(). It calls DOMDocument. The PHP documentation within a few clicks mentions "This extension requires the libxml PHP extension." This situation is admittedly at least mildly painful until one gets used to such issues. The solution in this case is "sudo apt install php-xml" in Ubuntu. Hopefully Big Evil Goo gives that solution simply. I knew it from searching my install list.
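
One quick way to tell "missing extension" apart from "my bug," as a sketch of the general idea:

<?php
if (!class_exists('DOMDocument')) {
    // on Ubuntu, the fix described above: sudo apt install php-xml
    exit("DOM / libxml extension is not installed\n");
}
$dom = new DOMDocument(); // safe to proceed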

A related problem I've had involves one of the MYSQLI constants, and that's MYSQLI, the current version, as opposed to MYSQL. I remember chasing my tail for quite a while because a MYSQLI constant wasn't found. The solution was as above to install mysqli. Ironically, it's been so many years that the install is "sudo apt install php-mysql" rather than mysqli. Both Ubuntu and PHP assume that it's been so long that if you're still running the old mysql library you are insane and beyond help.

Anyhow, to continue down the path of undefined stuff, and back to apprentice's specific problem, he would hopefully find that dao_generic_3 is not defined by PHP or anything he can find. Again, this is an argument that I should have called it kwynn_dao... I should also perhaps make it more obvious more often that /opt/kwynn on my system is a clone of what I generally call "kwutils." In his case, he does know these 2 points. We just had another exchange while I'm writing this. My new advice is of general importance.

To back up a bit, though. Sometimes I do something like "cd / && sudo grep -R dao_generic_3" That would quickly result in "opt/kwynn/mongodb3.php:class dao_generic_3 {" Much could be said about that, but let's stick with this specific debugging path:

I didn't demand he recursively grep. Given that the details of the above are not obvious, I mentioned that the include execution should go from kwutils.php to mongodb3.php. The fact that it's not in his case is baffling. We're both fairly certain he has the latest of both repos of code.

Which brings me to:

minimal functionality debugging technique

I want to run around in circles waving my hands in the air and screaming when I see people post 100 lines to a forum and want debugging help. First of all, I am not criticizing my apprentice nor even the people posting to a forum. Because a main job of mine is to reduce frustration, my apprentice should be asking me sooner rather than later. And here we are a while later, and I still don't know what's wrong in his case. The solution is not obvious, and even if it were obvious, I don't want him spending large amounts of time on such things.

With that said, for future reference for forum posters and apprentices, the minimal technique goes something like the following. I am assuming he is using the debugger and stepping through line by line. Actually, I'll put this in GitHub. One moment please....

Here is my minimal debugging example.

January 21 (2 versions -- 16:39 and 16:56)

This is the part several minutes later. So that means you have "base" PHP / CLI PHP running, or you would have gotten an error of program not found. As I mention below, server.php has no output yet--that is, it has no output when it's run as PHP. I should add while I'm at it that sometimes I'll use "exit(0);" or "return;" (or "continue;") for no apparent reason. I'm not entirely sure about NetBeans 12.4 and / or the latest xdebug, but slightly earlier versions needed some code on a given line to make it a breakpoint, so I will write exit or return or continue. That's why I also created the null function kwynn(); Sometimes I would use "$x = 1;" which causes great gravy grief if one is processing weather radar data with x and y coordinates. *sigh* So THEN sometimes I'll use "$ignore=34523;"

The incident with the weather radar took me way too long to debug.

Back to server.php. No, you don't have to run server.php separately because if Apache PHP is running, the browser calls server.php and thus runs it. However, running server.php from the command line is a good idea for debugging purposes, as I mentioned a few weeks ago. Generally speaking, there is an argument to be made that web code should also run seamlessly in CLI mode because it's often easier to deal with / debug code in CLI mode. That's why I created the kwutils.php function "didCLICallMe" because I set up various files to run standalone in CLI mode.
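
The usual detection pattern is a one-liner; this is the general idea, not necessarily how didCLICallMe() is written:

<?php
function calledFromCLI(): bool {
    return php_sapi_name() === 'cli';
}

if (calledFromCLI()) {
    // run the same code standalone, where the debugger and error output are easier to deal with
}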

Below is the first ~16:39 entry:

I started a web chat app. I'll point to the specific version in question. When apprentice hits send, he gets an XML parsing error, unclosed tag, in line 1, column 1, from the output of server.php. That is, he gets that in the JavaScript console. I assume that PHP is not running at all, but Apache is very likely returning the Content-Type as text/html in the return HTTP header. Apache is doing that based on the .php extension. So the browser is expecting HTML which these days should be in the form of XML. If PHP is working, the response would be completely blank because the whole server file is a PHP application with no output. I have put my debugger on line 6 and confirmed that the server is receiving data, so that's a form of output to me for the moment, to know that it's working so far. kwjssrp() in PHP and kwjss in JS are both in /opt/kwynn/js, and the relevant PHP file is included (require_once()) from kwutils.php. As a reminder to the hordes of people reading this, /opt/kwynn is a clone of my general utils repo.

Back to the browser--it sees the < opening tag of php, and the PHP tag is not closed for reasons I have gone over rather exhaustively, below. If you look at the server.php output in the browser (the HTTP response), it will almost certainly be the precise PHP code rather than executed code (which in this case would result in ''). The browser is trying to interpret this as HTML / XML. So, again, my assumption is that PHP is not running at all through Apache; Apache is just running the raw file. Given that this is your brand new install, I assume the solution is

sudo apt install libapache2-mod-php8.0
sudo systemctl restart apache2

If you haven't installed base PHP yet, I think the above will do it.

January 15, 2022

As I'm writing this, as of 18:37, apprentice reports a successful install of Ubuntu 21.10. It should be recorded as 21.10 rather than 21.1 because the Ubuntu format is YY.MM. Ubuntu releases every 6 months, in April (04) and October (10). The even-numbered years in April are LTS (long term support) releases, meaning support for something like 5 years. The other 3 releases in the cycle are supported for something like 9 months, so that you have 3 months to upgrade.

As an action item, does the following work?

sudo apt update ; sudo apt dist-upgrade

When I say "work," I mainly mean do you get any indication of broken packages? You may have no packages to install. You may also get an autoremove message. There is no need to do the autoremove step, nor is there any harm. I have never had a problem with autoremove. Also, it's possible you'll get a message about another process having the lock because "unattended upgrades" may be in progress. Generally, that's fine; just let the unattended finish. Sometimes unattended dies, though, and you'll sit there forever waiting. Usually that only happens when I'm updating a system running on a USB drive plugged into USB 2.0. (I have old hardware. I do have a USB 3.1 PCI card and use it all the time, but my hardware is old enough that it won't boot off of PCI devices.)

Which brings some more commands, in this case to know when a process like an upgrade has died. One is "top". I would play with that relatively soon. It goes in the category of you should know what it usually looks like, so you know how to look for anomalies. When an upgrade is working, or any other disk-intensive process, the "wa" (I/O wait) number will be very roughly 16 - 25% as opposed to near zero.

"sudo apt install iotop" and then "sudo iotop" is similar in that it does the same thing with disk data transfer. Again, I recommend looking at what's normal.

When I start a new partition / system / install, I don't bother moving data as part of the process. I move the data from the previous partition as needed, which is a good way to keep things clean.

Ubuntu comes with backup software. I have never played with it, but it might keep track of full backups versus incremental backups. That would solve your redundancy problem. The "rsync" command does that, too. I've been using rsync for the last several weeks to upgrade kwynn.com with this local copy of the DocumentRoot tree.

Please tell me you didn't install 3rd party software? If you did, just stay on top of broken packages.

In answer to one of your points from weeks ago, I just moved the Ubuntu Symphony JavaScript to an external file. I did this in part because I have no plans to touch the code ever again, and it was sitting there in this file in my way.

written before I knew you were done with your install

As for your immediate problem, I wasn't clear on at least a couple of points. Just below I go back to immediate solutions. But, first, to clarify one point: not installing 3rd-party / proprietary software is for future reference; at the moment, that damage has already been done. For future reference: when you install Ubuntu, there are only a handful of (sets of) options as you start the install. One of them is whether you want to install 3rd party software. It asks you this about the same time as it asks whether you want to update packages as part of the install. For future reference, I recommend against installing the 3rd party software, in part because I think it caused your problem. I suspect it's caused me problems in the past, too.

But back to the immediate problem, you may be able to solve everything if you just remove those 2 packages that are causing you problems. In case it's not clear, it's possible everything you're describing traces back to that. I will reiterate the point: if you see any indications, ever, of broken packages, back up your data and then attack that problem.

In the last few hours, I booted into my partition that is broken in a similar manner as yours. I was hoping to start removing packages to definitively record how it's done. My partition is far more broken than yours, though. The NetworkManager service was apparently "masked" by the cascading failure I had weeks ago. I had to look "masked" up. It's a step beyond "disabled" in systemctl / systemd. That means I had no network. My attempt to "unmask" it deleted it rather than unmasked it. I have almost no clue what that's about. Without a network to look things up and let Ubuntu load software, I decided it was time to give it up on that partition.

"$ man apt" indicates that it may be as simple as "$ sudo apt purge <package>" or "$ sudo apt remove <package>" After messing around with that broken partition, though, I decided I am not removing anything at all on any of my partitions.

If you decide to start over, I would install 21.10, the Impish Indri; I've been running it since soon after it came out. Impish is not LTS, so the consequence of this is that support will end in roughly July of this year, so by that time you should install 22.04, which will be LTS. My jury is still out on this decision in general, but I am leaning towards keeping up with every 6-month, full version. Probably the way to go is to upgrade your dev machine to every full version and then upgrade any clients once you're satisfied that their software will work.

One of the reasons I say this is because in addition to 3rd party software, I think you were zapped by Ubuntu's deprecation of 32 bit systems (i386). I'm assuming Ubuntu deprecated them some time after 20.04 based on observation; I'm sure you can find the official doc.

"Try Ubuntu" and a permanent USB install to let you boot

A USB installer has a "Try Ubuntu" mode. I recommend always having an installer USB on hand to give you something to boot from. Note that the "Try Ubuntu" mode will not save your data, though. Any data saved to the USB only lasts until reboot. You can of course save to a standard filesystem.

I also recommend installing from one USB to another USB such that the second one has a permanent install. Then you can do work on the full install. Also, the full install can be set to boot to your SSD or hd partitions. (You can do multiple installs onto a USB just like onto any other disk.) Note that when you are trying to boot from A to B, though, A must have a common kernel with B. Thus, you need to periodically update the USB. Also, when you are in Grub (just after power-on-reset) you'll notice that there are various options. Several of those options allow you to look for a common kernel. If you don't have a common kernel, when you try to boot into a partition, you'll get a message to the effect of "cannot load."

After you update the USB, "sudo update-grub" is what syncs the USB with what is available to boot to on other disks. Note that update-grub will set the partition running grub as the default boot partition from that physical device. This can get confusing if you run grub from an SSD or hd. If you had 2 partitions on an SSD and you update-grub from one, it would then be the default, which may not be what you wanted.

As I mention this, you should look into partition numbers and disk names (/dev/sda7). You should know what your / actually is on the /dev level. The spacing of this is very precise: "mount | grep ' on / '" and / or just run mount to look at it. I can't imagine what harm you could do if you run mount as the standard user (not sudo).

data management, commands that might save you in case of trouble, and one means of backup

You may do this anyhow, but consider various degrees of separation between a filesystem needed to boot / run and large data that is slower to copy. That is, your data and running files go into at least 3 categories: the files needed to run Linux, your small data such as code you type, and then various degrees towards your large data. Consider keeping your large data in one file tree that can be isolated or better yet on another partition. It makes it easier to know what you really need to back up, and it lets you copy a whole partition and then set it up to run.

The following is not definitive in that I am not going to take the time to test it over and over. It's something to explore.

One reason I mention all this is that now that you have a running, pure partition, possibly with little of your data, you might want to make a copy and learn how to make that copy runnable. First, I highly recommend doing the copy itself from another booted system, such as an install disk running in "Try Ubuntu" mode.

I often do the copy with the GUI: "sudo nautilus" Nautilus is the "Files" manager. If you run it as root, it can copy anything. There is a recursive "cp" that does this and preserves dates and users and such, but I lose track of what all those switches / arguments are.

So, make your copy. You should probably give the copy's partition a name and partition label. Otherwise, in /media/<your user> you'll get a UUID and have to figure out which one is which. In any event, navigate in the command line to the root of the other partition, which will start as /media/user/something. Then I give you the commands that are not harmful as long as you exit the shell when you're done, or at least exit from chroot. Then I give the last command that is probably not harmful but may be confusing.

First, you might want to mark both / and both /home/user with a file indicating which copy is which such as "iam_2022_0115_18_install"

# be in the root that you want to make THE root.  That is, pwd shows the relevant root
sudo mount -t proc /proc proc/
sudo mount --rbind /sys sys/
sudo mount --rbind /dev dev/
sudo chroot .
      

At this point you will be the root user (indicated by #) and the filesystem root will be the "new" root that you want to make independently bootable. Then I *think* a "sudo update-grub" is enough to make it independently bootable. I would not do this, though, until you are sure you can boot into a working partition from USB as above.

The reason you need to set grub from the new partition is that otherwise all of its mount points are set to the old partition because it is a copy. Before grub, the chroot sets / to the new partition, then grub sets the bootloader to recognize this new partition as bootable.

NetworkManager - something I learned today.

I already knew the systemctl commands below, but I didn't know the name of the network service.

Somehow when I booted into the broken partition, my active NetworkManager unset itself to a degree as I'll explain. That is, when I booted into the active one, I didn't have any network. I had to both (re)start it and re-enable it to start on boot:

sudo systemctl start NetworkManager
sudo systemctl enable NetworkManager
      

January 14, 2022

Ubuntu package problems and some solutions

My currently active apprentice is having package (software install) problems. These are some thoughts.

For one, when installing Ubuntu and probably other Linux distributions / "distros" / sub-brands, the 3rd party / proprietary packages / software are probably not worth it. That's one question Ubuntu asks by default upon install. I suspect that's the root of his problem. Specifically: when he runs "sudo apt install --fix-broken" these 2 packages have problems: "Errors were encountered while processing: libgl1-amdgpu-mesa-dri:amd64 libgl1-amdgpu-mesa-dri:i386"

When I run "apt list --installed | grep libgl1" I get libgl1-mesa-dri and libgl1/impish. I am reasonably sure this means he installed AMD specific drivers and I did not. I can run FlightGear reasonably well on very old hardware. I'd have to go digging for my graphics card specs, but I remember it only has 1 GB RAM which is old. Maybe the proprietary driver would help, but, as above, it's probably not worth the cost benefit, and I should get better hardware when I can.

I suspect he's also having problems because Ubuntu is pulling away from i386 (32 bit). That is an argument for installing the latest Ubuntu rather than the LTS version, but that's another discussion.

The GUI / Gnome / desktop was probably almost literally making noises about a package problem. Whether it was or wasn't, there is an argument to be made for checking by hand perhaps once a week:

sudo apt update
sudo apt dist-upgrade
       

If you get the package error message, do not ignore it. In fact, it's probably time to back up your computer. This has not happened to me often, but it has gotten out of hand perhaps 3 times in 12 years. Usually I was asking for it, but that's another discussion. (Below I list one instance of trying to downgrade PHP.)

In any event, solve the package error ASAP.

The solution to the problem is usually to remove the packages in question, and really thoroughly remove them. I can't think of a way to quickly set this up to demonstrate, so I'll have to refer y'all to Big Evil Goo. The commands involve such terms as "purge" and "remove" (apt's word is remove rather than delete).

Note that you may have to use the dpkg command for this process, and you may have to use aptitude rather than apt, although if it's not already installed, you probably can't install anything while that error is pending.
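As a sketch of the kind of thing Big Evil Goo will turn up--the package name is just the one from his error message, so substitute whatever is actually broken on your system:

# "purge" removes the package AND its config files, as opposed to a plain "remove"
sudo apt purge libgl1-amdgpu-mesa-dri
# clean up anything that was only installed as a dependency of it
sudo apt autoremove
# if apt itself won't cooperate, dpkg can force the issue
sudo dpkg --purge --force-depends libgl1-amdgpu-mesa-dri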

Command list:

# a list of commands or their close equivalents for package management
# use these 2 often, perhaps once a week or more:
sudo apt update
sudo apt dist-upgrade
# the rest are not harmful but only useful in context
sudo apt install --fix-broken
# I have never used the audit option before; I was simply looking for something harmless that reminds one
# of the dpkg command
# You may not have this package at all.  
sudo dpkg --audit libgl1-mesa-dri
apt list --installed      
# which package did x file come from:
# this is a small part of answering another question of yours
dpkg -S /usr/bin/find
# not directly relevant, but in answer to another issue
tail -n 20 /var/log/syslog       

For problems with running software as opposed to their packages, many pieces of software have a verbose mode or various levels of verbosity. Sometimes you can shut down the systemctl version of the program and run it "by hand" with the direct command and verbose turned on.

You asked if some of these commands are the direct equivalent of what the GUI does. Well, update and dist-upgrade are the equivalent of some of it, and as you discovered, sometimes doing it by hand is an improvement because the GUI will lose track of changing IP addresses. That is, always do an update before an upgrade.

The above will not upgrade you a full version such as 21.04 to 21.10. There is of course a command line for that, too.
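For the record, I believe the full version upgrade goes something like this--double-check before you leap:

# make sure the current version is fully up to date first
sudo apt update
sudo apt dist-upgrade
# then the release upgrade itself
sudo do-release-upgrade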

You mentioned digging around on various errors in the syslog. Unless the timing makes it certain it's what you need to focus on, I'm sure there is all sorts of junk in syslog that a perfectly running system spits out. You could spend forever chasing such things.

software (un)installs versus Satan's OS (SOS) / Linux file organization

I only have a fuzzy understanding of how the files are laid out--libraries, binaries, datafiles, etc. In the immediate context, solve your package problem first. I wasn't trying to discourage you from learning more; I was just saying solve that problem first. I don't think a better understanding of the overall system will help you solve the immediate problem.

You pointed out in SatanSoft that you can delete software more easily. There are a number of reasons for that. For one, almost all software is using SOS code. In Linux, you have an ocean of free software to choose from. The Nu HTML Validator uses both Java and Python, for example. There are so many choices of how to build software that it leads to so many more packages. Open source software is built on layers and layers and layers of open source software. The dependency graph is much more complex.

For a number of reasons, SOS apps are already compiled for a known OS, and SOS takes care of any small differences in hardware. The software is already a more-or-less standalone set of files when it ships. Lots of Linux software is interpreted rather than compiled, so its dependencies are not compiled in. Also, Linux runs on such a variety of hardware that an app couldn't compile for a known hardware target anyhow.

Another way of phrasing the above is that SOS software can make assumptions about what libraries are available because those libraries come with SOS. With Linux, one can make very few assumptions, so one has to make a package dependency list, which in turn invokes other dependencies.

Again, many more choices and possibilities.

Linux packages do have an uninstall. I just haven't listed the command because I don't have anything I want to uninstall that badly. Also, with Linux, how deep down the dependency tree should the uninstall go? It can't uninstall stuff that other software depends upon.

Upon thought, if you continue to have trouble, I can do LOTS of uninstalling on one of my partitions. This was weeks ago when I tried to downgrade to PHP 7.4. I used a PPA that I had success with before going forward, but going backward got out of hand quickly. I wound up with errors such as "yours" of the day, and I abandoned ship and built another partition. I can go back to that old partition and start stripping it down to get rid of stuff that doesn't work anyhow. I did NOT lose data. It would be difficult, in fact, to lose data due to package issues, even if the system was unbootable. You could still get at the data with another running system, whether on another partition or a USB.

several emails ago

You said that you've read some of these entries 3 times. That doesn't surprise me. In case there is doubt, I realize that a lot of this simply cannot make sense yet because you don't have the context. Hopefully in a small number of months, you can re-read these and get more out of it.

January 13, 2022 - broken HTML validator - continuing 02:38

I have the Nu HTML validator working locally. I cloned it to /opt and then this works: python3 ./checker.py all and then python3 ./checker.py run .

Days ago I nattered on about the referer HTTP request header as it applied to the W3 HTML validator. Coincidentally, I just realized that the referer is essentially broken for validation purposes. It appears that in March, 2021 (version 87), Firefox started to "trim" the referer down to the domain name. It would appear that I have a lot of work to do to fix this.

I already fixed it for this page, or at least I gave it the quick-and-dirty fix. I just hard-coded the URL to kwynn.com. That won't work for a fully online test system--that is, a test system accessible from the outside.

The right solution is JavaScript populating the URI / URL on page load. But I have to add that JavaScript to 156 pages according to my grep -R. That is one of the rationales for a single page system--route all page requests through one program that adds / changes whatever needs adding / changing.

January 12, 2022, one of several entries today, starting this at 23:26

Updated 23:47 with some answers. Updated 23:54 regarding http://validator....

I'm considering several changes to my Apache config, several of which I've tested. Some notes below. I am not using newlines in a realistic way--that is, the following is not precisely correct.

January 12, 2022, one of several entries today, starting 20:43, first posted 22:20

(I made small revisions at 22:38.)

And back to my web dev Q&A with my apprentice. He is the "you" to whom I'm originally speaking.

First, something I should emphasize again about sending email. Among basic features that should be simple, it is one of the harder things I've done over the years. I'm sure I've done lots of harder things, but nothing that is both such a basic feature and so hard. Part of the problem is that there are US Federal laws and presumably many others around the world regarding spam, so the providers are very cautious.

To clarify for myself and everyone else, your point about "from scratch" was about memorizing stuff. There are basic things I've done several thousand times that I still look up in some sense of the term "look up" (see below). That's in part because PHP is inconsistent in the order of function arguments in a few places, but that's not by any means the only example.

I'm trying to stop beating on php.net and instead grep -R my own code. I will suggest something that I have only done sporadically:

Create your own page and / or your own program that demonstrates all the things you keep looking up. Exaggerate the clarity of the variables. I tend to use very short variable names because I get tired of typing them, but I should take my own advice when I'm giving examples.

I think it'd be very funny and geeky and cool to have one bigger and bigger program that demonstrates all of the little "primitives" of coding, where I'm using the term primitives to mean the syntax, the functions, the order of the function args and what they are, snippets of doing a bunch of simple little things, etc.

Alternatively, I have downloaded all the PHP doc, but I never fully installed it. I've also considered installing it on Kwynn.com. I've also also (sic) considered installing the "Nu" HTML validator that W3 runs. Recently I noticed a reference to installing it yourself.

Other than my not understanding the memorizing key to your question, we seem to be on the same page.

What, me using uncommon words such as Quixotic? I would never do such a thing.

WINE is a mixed bag. Around 5 years ago an apprentice got Call of Duty working in Linux running on iCult hardware. I don't remember if it was perfect, but it at least worked fairly well. I've had some success with WINE, but I'd say it's still a pain. A potential apprentice recently mentioned Zorin Linux, which is based on Ubuntu. Apparently the emphasis is on making it easier for people to migrate from SatanSoft. The Zorin WikiPedia article mentions both WINE and PlayOnLinux, which Zorin encourages.

HTTP headers

Regarding the "header thing" or the "DO NOT CLOSE PHP TAGS UNLESS..." thing, which I got into in the last few weeks, this issue is a special case, just like sending email is a special case. The email thing you came up with on your own, in that you brought it to me. The tag thing I'm shoving a screeching at you because it has caused me enormous damage. For one, it's a special case because it goes in the "hear me now and believe me [and understand me] later" category. It's probably not worth taking the time to reproduce the dreaded "...cannot be changed after headers have already been sent" error. With that said, I'll give some explanation.

Below is a relevant example. I have removed a number of lines to make it smaller.

$ curl -i https://kwynn.com/robots.txt
HTTP/1.1 200 OK
Date: Thu, 13 Jan 2022 02:11:22 GMT
Server: Apache/2.4.41 (Ubuntu)
Last-Modified: Sun, 04 Oct 2020 04:13:01 GMT
ETag: "195-5b0d094ed72a3"
Content-Length: 405
Content-Type: text/plain

User-agent: *
Disallow: /t/8/01/wx/wx.php
Sitemap: http://kwynn.com/t/1/01/sitemap.xml
Sitemap: https://kwynn.com/t/20/10/sitemap_begin_2020.xml

When your PHP runs in "web mode," it is usually responding to an HTTP GET or POST, which I'll demonstrate more specifically in a moment. It must respond to an HTTP request with a result specified by the protocol (HTTP). It's doing some of that behind the scenes. PHP *MUST* generate headers for the browser to accept the result. The curl command above generates a GET, and that is the result including the headers, minus a few lines. I'm not sure which of the above are absolutely required in an HTTP response, but, whatever the case, the browser is expecting a number of lines and a number of types of lines and then a double newline. Everything after the double newline is the body which is often the HTML itself. If you output anything, even accidentally, from PHP before the proper headers go out, you may break your page.

If that's all there were to the "cannot be changed" issue, it wouldn't be so bad. But, believe me, at least years ago the results were unpredictable to the point that it seemed like a virus. (A real computer virus that does damage to data, as opposed to fake viruses that do nothing to people.) At that time, I was just starting to do hard-core PHP dev, and I could not figure out what was going on. (I think I already was using a debugger.) I'm sure I Googled, but I guess it still took me a while to figure out what was going on.

I was going to show you what I thought was a more "pure" example of an HTTP GET, but it doesn't work quite like I expected. I think that's because I'm not sending a proper HTTP request packet, and my brief attempts to do so didn't get me anywhere. But hopefully the following gives you more insight. Note that I'm removing parts of HTML tags because it seems that the validator doesn't like CDATA, or maybe CDATA has been deprecated.

$ telnet kwynn.com 80
Trying 2600:1f18:23ab:9500:acc1:69c5:2674:8c03...
Connected to kwynn.com.
Escape character is '^]'.
GET /
!DOCTYPE html
html lang="en"
head
[...]
titleKwynn's website/title
[...]
/body
/html

You can see the request and response headers in control-shift-I network, and there are other ways to get at it. Note that if you ever parse a raw HTTP response, all of the newlines are old-fashioned \r\n rather than just \n in Linux. This becomes important if you try to separate header and body. I parse it with "$hba = explode("\r\n\r\n", $rin);" on line 34.
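Expanded into a minimal sketch--the variable names are mine, and the response string is a fake stand-in rather than a real capture:

<?php
$rin  = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello"; // pretend this came off the wire
$hba  = explode("\r\n\r\n", $rin, 2); // limit of 2 so a blank line inside the body doesn't split it
$head = $hba[0];       // status line plus headers
$body = $hba[1] ?? ''; // everything after the first \r\n\r\n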

As for your having no experience with PHP and "header()," hopefully I've shown that this is a much wider question than PHP. Everything on the web uses headers.

As for the header() PHP function, that lets you add headers when needed. I hope you have realized that all headers must come before the main body of the output. :) Reasons I've had to use headers are:

Hopefully that gives you a lot more under the hood.
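Before moving on, here is a minimal sketch of header() in use. The header values are generic examples of the kind of thing I mean, not a list of my actual reasons:

<?php
// any header() calls must come before ANY other output, or you get the dreaded error
header('Content-Type: application/json'); // tell the browser the body is JSON rather than HTML
header('Cache-Control: no-store');        // ask the browser not to cache the response
echo(json_encode(['ok' => true]));        // only now does the body start
// do NOT close the PHP tag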

kwas()

In your kwas() example, you'll want a line before exit(0); that does echo('OK' . "\n"); I assume it's not outputting anything because you got sendmail installed, and sendmail is accepting emails for processing and thus mail() returns true. (As I said at great length below, that in itself won't get you much closer to sending an email, but anyhow...)

kwas() is about checking your assumptions ALL THE TIME. In your example, you didn't output anything in the event of success.

You did incorporate kwutils into your code just fine. You just didn't do anything to output in the event of success.

There is more to say on this, but I'll wait until you have more examples.

Here is an example of one consequence of require_once'ing kwutils. First without then with:

<?php
$a = [];
$b = $a['a'];
echo('got here' . "\n");
// RESULT:
[stderr in red:] PHP Warning:  Undefined array key "a" in /home/k/sm20/frag/kwuex.php on line 4
got here

<?php // new program
require_once('/opt/kwynn/kwutils.php');
// then same 3 lines as above
// RESULT:
ERROR: kwuex.php LINE: 4 - Undefined array key "a" /home/k/sm20/kwuex.php

Because I change the error handler in kwutils.php, warnings become fatal errors. I am certain this is the right answer in the real world. Just about all of my own code uses it. I haven't been able to fully prove it's the right answer in my main paid project because Drupal throws warnings right, left, and center. But the new version is most certainly going to take the position of forcing fatal errors. I made this change in response to annoyance at Drupal throwing warnings and continuing execution.
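In case it helps to see the technique without digging through kwutils, here is a minimal sketch of the idea--promoting warnings and notices to exceptions. This is the concept, not my exact kwutils code:

<?php
// anything that hits the error handler becomes an exception, so execution stops instead of limping along
set_error_handler(function (int $errno, string $errstr, string $errfile, int $errline) {
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
});
$a = [];
$b = $a['a']; // now a fatal, uncaught ErrorException rather than a warning that scrolls by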

There is as always more to say, but that will do for now.

January 12, 2022, posted at 20:33 (started 18:11) - installing an HTTPS SSL cert locally

This is my second entry started today and the 3rd entry that either started or bled into today.

You have to have a domain for an SSL cert. For local dev purposes, I highly recommend using a subdomain (testinglocal.example.com) even if you're not using the base domain. One reason is that when in development, you change things so much that you might go over certbot's limits on certs. It's something like 5 certs a week for a given fully qualified domain. Thus, if you're using a subdomain, you can just use another subdomain (testinglocal2.example.com) rather than losing access to the base domain for several days. This isn't theory. I went over the limit several months ago. It snuck up on me.

As I muck around with my internet service provider's modem / router, I'm finding that my local system does not have a 32 bit IPv4 identity. This is important for firewall reasons. So, let's see if this works: "$ ifconfig | grep global". That results in 3 global IPv6 addresses. The first one didn't seem to work, so then I added a second. Then, as I ran curl from kwynn.com (external to my local system) and found which address it was using, I went back down to only one IPv6 address in the DNS AAAA record for the subdomain. You register DNS records with your domain name registrar. I use Hover.

My router's ping / icmp settings were somewhat confusing. The setting "Drop incoming ICMP Echo requests to Device LAN Address" had to be turned off for the "Global Unicast IPv6 Address" of the router itself to respond to ping. In order to ping my local system, "Reflexive ACL" had to be turned off. That needs to stay off through the certbot verification because the process needs a system passively listening on port 80.

Turn off any active local sites except the one in question. That is, disable them in Apache then restart. Below, "irrelevant" is the site / virtual host defined in /etc/apache2/sites-available/irrelevant.conf

sudo a2dissite irrelevant
sudo systemctl reload apache2

I set the one active virtual host to simply be receiving *:80: <VirtualHost *:80> Putting an IP address in the VirtualHost top line did not work--certbot did not find anything listening, even though curl worked. You also need to set the ServerName to the fully qualified domain. Don't forget to restart Apache.

Note that the ufw default settings were not causing a ping problem. As for getting ready for the certbot: "$ sudo ufw allow http" and "$ sudo ufw allow https"

Certbot home, and relevant-to-me instructions.

First, do the "dry run": "$ sudo certbot -v certonly --apache --dry-run" Then do the real thing: "sudo certbot --apache"

Once SSL is working, you can put an entry like this in /etc/hosts:

127.0.0.1   blah.example.com

Once you do that, you can reverse all the security and even the AAAA DNS record because the site is now self-contained. Note that you have to open things up again to renew the security cert before 90 days.

If needed, set your firewall back to "Reflexive ACL" on. Then "$ sudo ufw status numbered ". Assuming you want to delete all the rules, run "sudo ufw delete 1" repeatedly until they're all gone (the rules renumber after each delete). Delete the AAAA record.

For cleanup later, when you're done with the cert: $ sudo certbot delete --cert-name blah.example.com

January 12, 2022 (AM) - the concept of "from scratch"

Note that my January 11 entry on email continued into today, January 12. This is a new entry started on Jan 12.

Yet again I'm continuing my web dev Q&A, usually answering emails from one apprentice. He is the "you" when I use "you."

You asked about Googling HTML and CSS and "from scratch." I fear you may have seriously misunderstood what I mean by "from scratch."

Before revisiting "from scratch," I'll separate what I do and do not recommend:

DO

do NOT, at least at first

When I say "from scratch," in part I mean use out-of-the-box JavaScript as opposed to a general library like React or Angular or even jQuery, at least to start. You should know how to use basic JavaScript. After gaining some understanding, perhaps in a month or two or so, then by all means experiment, probably starting with React. For now, for my own purposes, I have turned away from all of the above including React. I still want to whittle on my own JavaScript. At this rate it will be 6 months at least before I reconsider that. I put React highest on the list because I have not tried it and have heard good things. I have tried Angular and I found it to be a waste of time in the short-term. It took longer to learn the "Angular way" than it would have taken me to do it myself and even create my own portions of a library.

The same applies to using WordPress and Drupal and such in PHP. They fundamentally alter the dev landscape. Using WordPress in some cases may be a necessary evil, but you should still know how to do various things yourself.

You're going to be Googling HTML and CSS and many other things for your entire career (or at least until Google is renamed after being seized for crimes against humanity). Starting with "program by Google" is fine as long as you eventually understand enough to modify it. The long-term problem with "program by Google" is people who do that and barely make it work and then have no idea what to do when it stops working.

I need to fully distinguish PHPMailer from Angular. Angular fundamentally changes how you do things. Angular is a very general library that does a lot of stuff. Again, if you're going down that route, you should know how to do basic things in pure JavaScript first. PHPMailer does something very specific; it does not alter the entire dev landscape. You don't have to understand the internals of everything you use, or you'd never get anything done.

Perhaps another way to come at it is that Angular is an overlay with which you do general dev. It's an overlay of pure JavaScript. You should know the basics of pure JavaScript first. You should be able to implement basic logic in pure JavaScript first. PHPMailer is an overlay of SMTP, but it does a very specific task. There is no reason for you to implement anything that specific. If you implemented everything yourself, you'd never get anything done.

Another way: your current goal is to write a web form and get email notification of the entry. You should understand a basic web form and be able to modify it yourself, even if you copied the code. You should have fluid control over your core product--the web form. A web form is very general and can have many fields and do many things. PHPMailer sends emails. If it "just works," great. It's not the core of what you're doing.

Ideally, "from scratch" means you typed all the core code yourself, or you copied it from yourself. You may be typing it yourself but looking every single detail up. The next nuance that is good enough is that you copied it from someone else but you come to understand it well enough to modify it. Then the next time you are copying from yourself.

Going back to the Bootstrap CSS example, one problem with importing the entire Bootstrap CSS is that it formats every p, div, li, tr, td, etc. You wind up trying to override it and otherwise fight it. I addressed this roughly 2 - 3 weeks ago below when I talked about PHP date formats. The entirety of Bootstrap.css is huge. I whittled down to what I wanted and it was a tiny fraction of the whole thing.

Another way: all of the "bad guys" above are very general libraries or systems or whatever that put a layer between your code and the fundamental code below it--whether that's PHP, JS, CSS, HTML, or whatnot. You don't want to distort your dev environment like that until you at least know what a pure environment looks like.

January 11 - 12, 2022 - sending email "programmatically" (maybe done at 01:39 Jan 12, my time, UTC -5 / New York / Atlanta)

Continuing again with my web dev Q&A...

One lesson is that I realized below that my stack trace in the kwas() example revealed my username by way of revealing its path. I removed that part, but it's a lesson in security. Knowing my username should not matter too much, for a number of reasons, but there is no reason to reveal it, either. It's good to consider such things for cases in which it does matter.

As of the Jan 12, 00:27 version, I have reworked this somewhat since anyone last saw it. For one, I moved the section on email providers up above the details of the PHPMailer class.

My first comment goes back several entries, including an indirect reference in my Jan 9 entry: DO NOT CLOSE PHP TAGS UNLESS THE CONTEXT DEMANDS IT! I will try not to boldface and put that in red, but we'll see what happens. Just after your mail(...) function, you close the php tag. In your code snippet, there is no HTML or hint thereof, so there is no need to close the PHP tag.

Regarding the mail() function, what is the date on that code snippet? Using the mail() function has become more and more problematic over the last several years. I hope no one is posting that recently.

As for the gory details of the mail(...) function: as I read the documentation, I'm somewhat surprised that I have to read quite a bit before getting a hint as to your problem. I know what your problem is, more or less, but I'm looking at it as if I didn't.

To take the problem in parts: this is a case where kwas() would help. Also, I mentioned that you usually want to be able to run code in CLI mode for debugging purposes. Below is what happens when I use kwas() in CLI mode; something similar would happen in "web" mode with kwas().

First, the code, then running the script, below, and here is an active link to the following: https://github.com/kwynncom/kwynn-php-general-utils

<?php
require_once('/opt/kwynn/kwutils.php'); // a clone of https://github.com/kwynncom/kwynn-php-general-utils
// kwas() is currently defined on line 50, but that of course is subject to change
kwas($res = mail('bob@example.com', 'testing', 'test'), 'mail() failed - Kwynn demo 2022/01/11 21:58 EST / GMT -5');
exit(0); // I am simply emphasizing that this is the end of the code -- do NOT close the PHP tag!!!!
        

Running the script--the same thing happens in NetBeans CLI in the Output / Run window:

$ php mail.php
sh: 1: /usr/sbin/sendmail: not found
PHP Fatal error:  Uncaught Exception: mail() failed - Kwynn demo 2022/01/11 21:58 EST / GMT -5 in /opt/kwynn/kwutils.php:51
Stack trace:
#0 [...] mail.php(4): kwas()
#1 {main}
  thrown in /opt/kwynn/kwutils.php on line 51        

I'll come back to this. It occurred to me that nothing says you have to use kwutils in its entirety. There is an argument for using your own equivalent step by step as you understand the consequences. The two points that I want to emphasize are kwas() and the two functions involved in changing the error handling such that notices, warnings, and whatever else become exceptions. Those two functions are my own kw_error_handler() (currently line 69) and set_error_handler(), a built-in PHP function, called on line 77.

Back to the error at hand, a related technique to "kwas()" would be to note that the mail() function returned false. You'd have to assign the return value to a variable to see that, though:

<?php
$mailResult =  mail('bob@example.com', 'testing', 'test');
if (!$mailResult) die('mail() fail'); // kwas() does the same thing with fewer lines, vars, and chars :)

Also, in web mode, /var/log/apache2/error.log does show the error: "sh: 1: /usr/sbin/sendmail: not found"

You mentioned Postfix. It may or may not install sendmail. I don't remember Postfix's relationship to sendmail. With email, there is both incoming and outgoing. Even if you got sendmail installed, though, then there is the matter of configuring it. I'm not sure I ever got that working right. I got incoming sort of working, years ago.

Even if you got sendmail working and the email got farther in the process, you have another big problem or several related ones. When you use sendmail, it is going to (try to) connect to the server of the domain name of the recipient as specified in the MX DNS entry of the domain name. Let's say that's gmail.com. GMail may not accept the connection at all. Years ago, sendmail would have worked fine, but then spam came along, and then SSL came along. And then domain name keys came along, and related stuff around email and validating email. Even if GMail actually accepted the email, it would send it to the recipient's spam box unless you did a LOT of work. The work would involve DKIM and whitelisting and God knows what these days.

So the mail() / sendmail path is a very steep uphill battle. I've never tried fighting it very far. I have also been bitten by this exact problem in a real, paid project. It took until roughly 2 years ago to start causing bigger and bigger problems to the point of total failure. Before that, there were spam problems.

Far be it from me to call something like getting sendmail working a Quixotic quest. I have made some motions to that effect. However, in terms of an actual real-world solution, even I have ruled it Quixotic.

I have used 3 solutions in the real world. All of them involve the PHPMailer class that I address further below. First, though, you have to decide on an email sending provider, unless you want to fight the aforementioned uphill battle.

email service provider (sending email)

As I rework this, I realize that all this is just sending email. That is your immediate problem. I'm not even going to address receiving because I'm not happy with my solutions.

I said that I have used 3 solutions, all involving PHPMailer. I do NOT recommend this first one, but I want to address it because it shows you historically how things have gone along different paths, and it gives you basic info before getting somewhat more complicated.

If you wanted to use GMail to send, there is at least one hoop to jump through even with the not-recommended path. If you want to do it the 2010 way, you have to change a setting in your overarching Google account. (I thought you had to specifically turn on SMTP, but perhaps not. I am probably thinking of IMAP and not SMTP.) You have to set your overarching Google account's security to allow "Less secure app access." You would do that to avoid the infamous OAUTH2. I'll leave it at that short sketch because I don't recommend it anyhow, for several reasons.

I used that above option in the real world until several months ago. One problem is that Google will eventually and intermittently cancel the "allow" option. It's just not a viable option anymore. The next option, which I still don't recommend, is to use GMail with the infamous OAUTH2. I started doing that a few months ago when I stopped using option 1, so I am currently doing it. There are a variety of problems using OAUTH(2), however. I'll mention it as a possible option and then skitter away from it because it's a pain. I have a specific reason for using it right now, but I'm still on the fence as to the cost-benefit. In your case, I would strongly consider option 3:

Here I will propose something that may be mildly surprising or very surprising. I like both free as in speech and beer, but in this case I'm going to recommend a paid option, although it's almost literally dirt cheap for our purposes.

Yes, it's tempting to use Big Evil Goo for free as in beer (where you and your data are the product - TANSTAAFL), but it is a pain. I would probably step you through it if you really wanted to, but it borders on Quixotic even for me.

So I use AWS' Simple Email Service (SES), even for my own, non-paid-project notifications. The cost is so low that I don't think I've actually paid a cent for it even though I use it with a paid project. The project emails ~4MB files that add to the very, very low cost calculation. The price is something like 1 cent per 100 emails or maybe even 1,000 emails plus 1 cent per 100 MB of size, or something like that.

For purposes of being thorough from a tech point of view, MailChimp Mandrill is an equivalent service. I am almost certain MailChimp has gotten into the deplatforming / censorship game, though, so I don't recommend them on that basis. I did some testing with Mandrill years ago when it was free, but I also can't recommend it beyond roughly 6 - 7 years ago because I haven't used it since.

SendGrid is another alternative. I would not say I have used it so much as I have seen it used, but that was over 3 years ago.

Getting back to AWS SES, I need to add another few steps. You create a user in the SES screen. That user includes the username and password that you'll use in PHPMailer. Note that the user you create in the SMTP screen is an IAM user, but you do NOT want to interact with that user in the IAM screen, as I further address below.

Also note that the PHPMailer username is not the IAM user with dots in it (by default). The email PHPMailer username is of the form AKIA3ZN... continuing with several more uppercase letters and numbers. As the instructions tell you, you only get to see the password or download it once upon creation. Otherwise you have to create a new user, which is no big deal, but I mention it just to save you frustration. Note that I have found that renewing the credentials of an SES user in the IAM screen does not work. If you want to change the password, just create a new user in the SES screen and change both the username and password. If you change just the IAM password, you get silent failure. That is, you get silence at first glance. I never even set the debugger on it to see when or if the "silence" ends. I just went back to the SES screen rather than the IAM screen.

Another small potential problem with AWS SES is that you STILL have an issue emailing to arbitrary users--yet another layer of spam protection. By default, when you start using AWS SES you are in "sandbox" mode. In sandbox mode, you send a potential recipient an email from an SES screen, and he clicks an activate link. THEN you can email that address.

The SES screens list the port number and SMTP server and SSL / TLS / whatever settings, too, and they are in my code I mention below. Once you have a username and password and approved recipient, you're getting yet closer to actually, like, SENDING AN EMAIL. Amazing, huh?

PHPMailer class and composer

All of my solutions involve the PHPMailer class. I install it with the "composer" command. "composer" itself is mildly irritating to install, as I remember. You can start with "$ sudo apt install composer" but I'm not sure it's going to work. This is one of the roughly 20% - 30% of cases where "apt" / Aptitude is either not the entire solution or the recent-enough package just doesn't exist for Ubuntu / Debian. See what happens. This is a case where I can probably help quite a bit. Yes, the solutions are of course out there, but I still remember that it was irritating.

Composer is a tool specific to PHP. It's a package management system for PHP (source code) libraries. When you install a composer library, by somewhat circuitous steps it's a series of includes / requires / require_once() that pulls the PHP source code into your own code. That means that you can debug a composer-installed library. I don't think I've had to fix a bug in a composer library, but I have debugged into several of them in order to understand a problem and / or learn about how the library works.

As an aside, I specified that composer installs libraries that are included and can be debugged. That's as opposed to a library / extension that adds native PHP functions. For example, my nano extension is a PHP extension written in C that creates a few native PHP functions. Once it's installed, you simply call "nanotime()" like any other PHP function with no include / require / require_once needed. You cannot debug nanotime() just like you can't directly debug mail() by stepping into it.

Getting back to your original problem, first you have to get composer installed. Then you need to decide where to put composer libraries. I use /opt/composer; I had to create the "composer" directory. Then note that composer wants you using a standard user, NOT root or sudo. Therefore, going back to your lesson on permissions, I recommend changing "composer" to be owned by your own user and giving it 755 permissions (rwxr-xr-x). The world / "other" users need to be able to read and pass through. There is no security issue with reading because the composer libraries are "public" code in the same sense that the "ls" command is public.
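A minimal sketch of that setup--"youruser" is a placeholder for your actual username:

sudo mkdir -p /opt/composer
sudo chown youruser:youruser /opt/composer   # youruser is a placeholder
chmod 755 /opt/composer
cd /opt/composer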

Once you have your permissions right, do the following. In my case, it's already installed, so your results will be different:

/opt/composer$ composer require PHPMailer/PHPMailer
Using version ^6.5 for phpmailer/phpmailer
./composer.json has been updated
Running composer update phpmailer/phpmailer
Loading composer repositories with package information
Updating dependencies
Nothing to modify in lock file
Installing dependencies from lock file (including require-dev)
Nothing to install, update or remove
Generating autoload files
5 packages you are using are looking for funding.
Use the `composer fund` command to find out more!  

Unfortunately, you're not quite done with your intro-to-composer experience. After I just finished saying above that I wanted to emphasize 2 parts of kwutils, I need to add a 3rd. ("Our three main weapons are fear, surprise, ruthless efficiency...") Once again, you don't have to use my kwutils, but you need to know how to use composer. If you go grubbing (grep'ing) around in kwutils, you'll see I do it like this... Actually, this brings up an interesting question for you. If you use the whole kwutils, I think PHPMailer will "just work" once you have it installed under /opt/composer. Let's see...

<?php
require_once('/opt/kwynn/kwutils.php');
kwas(class_exists('PHPMailer\PHPMailer\PHPMailer'), 'class does not exist');
echo('OK' . "\n");

Yes, it just works. If you think that the 'PHPMailer\PHPMailer\PHPMailer' syntax is one of the weirdest things you've ever seen, I agree. It gets into PHP "namespaces." I understand the concept, but I have barely studied them and have barely attempted to ever actually use them for my own code. One of the lessons I like to convey to apprentices is that I am very far from all-knowing, even when I should be a PHP "expert."

There may be "gotchas" just with require_once'ing kwutils. Maybe you'll find out. Either way, you should still understand what's going on behind the scenes:

<?php
set_include_path(get_include_path() . PATH_SEPARATOR . '/opt/composer');
require_once('vendor/autoload.php');
if (!class_exists('PHPMailer\PHPMailer\PHPMailer')) die('class does not exist');
echo('OK' . "\n");   

That works. As for actually USING PHPMailer, that is yet another step. Isn't this fun!?! Actually, in terms of something that should be simple like sending email, this is one of the harder tasks I've had over the years. Be happy that you're learning from my experience. :)

So, with that said, here is another decision point. I have created my own email class to use PHPMailer. There are most certainly "gotchas" on that--that is, if you use my class precisely, you have to set up the credentials like I did, and there are probably other gotchas. Hopefully I give instructions. (It's been long enough that I don't remember.) And if you want to do it "your way," that's fine, too. Also, I just created a web form with email notification a few days ago. Yours does not have to be that complicated. You can just use an HTML "form" for now. I get all fussy about save-on-edit (AJAX) because it was a specification of my main client. It was a lot of work to implement such that I'm still perfecting it.

Actually, to digress again, the save-on-edit went in 2 phases (so far). For the most part, I got it working several years ago and that is still working. Months after one of my revisions, we learned the hard way that my solution lost way too much data in some cases. I never did figure out what the "cases" were; I just reconceived and rewrote part of it. This problem wasn't catastrophic but it was of course annoying. I rewrote the one field that was causing problems. Since then, it has worked to the point that my client hasn't reported any more problems. I have reason to believe that small bits of data are still being distorted, but it's obviously not critical. Obvious because nothing bad has happened in a long while.

Because I got tripped up over that, I've kept whittling on my save-on-edit technique. I will probably rework it yet again with my main client in the next few weeks, as I partially rewrite the whole application to escape from Drupal and be compliant with PHP 8.0.

Back to your email problem. As for PHPMailer, you have my examples, and there are plenty more examples out there. I'm going to try to wind this down.

ALL THAT is to say that email is no longer easy because of nearly 30 years of spam wars.

January 9

Cue Rage After Storm's "*autistic screeching*" that I address at some length in my new personal blog. Several days or perhaps a few weeks ago I addressed PHP tags and the infamous output-before-headers issue. Now I can quote it precisely because I encountered it again, "...cannot be changed after headers have already been sent." I'm not sure that was the exact wording I saw many years ago, but it's close, and it's the same problem. In this case, the exact quote was "ERROR: kwutils.php LINE: 201 - session_set_cookie_params(): Session cookie parameters cannot be changed after headers have already been sent /opt/kwynn/kwutils.php" For the record (again), /opt/kwynn is my clone of my general PHP utils file and repo. Note that the link is to a specific version of the file--the relevant one.

I felt like "*autistic screeching*" when I saw that. The good news is that now I know what to do.

I'm going to get lazy and stop linking stuff. You'll see changes in my GitHub in at least 2 repos in the near future. I'm writing this 1/9 at 00:26 my time. The short version is that you call a parent PHP file as the link target and then require_once() the template. The session stuff goes in the parent file before the template is called.
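A minimal sketch of the shape I mean--the file names are made up, and the session calls are generic rather than my exact kwutils calls:

<?php // parent.php - this is the file the link / URL actually points to
session_set_cookie_params(['httponly' => true]); // any session / cookie / header work goes FIRST
session_start();
$pageTitle = 'My page';       // build whatever data the template needs
require_once('template.php'); // only now does any HTML output begin
// do NOT close the PHP tag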

2022, January 5 (PM) - at least 2 entries

"side exit" from the shopping cart (16:59)

Continuing again the web dev Q&A...

The principle of "do something" includes taking "side exits." It's fine to divert from the shopping cart to do something simpler with a database. Any understanding you gain is "doing something."

MySQL became MariaDB...

...and "Istanbul was Constantinople."

I should have thought to mention this earlier. If you take the relational route, MySQL became MariaDB. For Ubuntu installation, I *think* all you need is sudo apt install mariadb-server
In case it helps, I list what I have below. The one command above should kick off the rest. Note that you'll need to download MySQL Workbench directly from Oracle.

   apt list --installed | grep -i maria
[...]
libdbd-mariadb-perl/impish,now 1.21-1ubuntu2 amd64 [installed,automatic]
libmariadb3/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-client-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-client-core-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-common/impish-updates,impish-updates,impish-security,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 all [installed,automatic]
mariadb-server-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-server-core-10.5/impish-updates,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 amd64 [installed,automatic]
mariadb-server/impish-updates,impish-updates,impish-security,impish-security,now 1:10.5.13-0ubuntu0.21.10.1 all [installed]  

To add to the confusion, MySQL is still being developed, as far as I know, but when Oracle bought the MySQL company, the open source community forked MySQL into MariaDB. When people speak of MySQL these days, they probably mean MariaDB in most cases, or perhaps 85% of cases.

2022, January 4 - 5 (AM)

entry 2 on the 4th then into the 5th - sessions, etc. (into Jan 5 01:10)

Regarding sessions, my update to my pizza code gives an example. I'm only using a handful of functions from /opt/kwynn, so you can either extract them or use my whole utility. A reminder that I addressed this at some length days ago. Some of the usage in my little functions is very hard-won information.

The session ID returned by my function keeps track of one user. Behind the scenes, a cookie is going from server to client and back. Keeping track of the session is really that easy. You can just call my "start" function every time because if the session is already started, my function will return it. Perhaps my function needs a better name, in fact, or an alias.
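If you'd rather see the idea than dig through /opt/kwynn, here is a minimal sketch of a start-or-reuse function. This is the concept, not my exact code:

<?php
// safe to call on every request: starts the session only if one isn't already active
function startOrGetSession(): string {
    if (session_status() !== PHP_SESSION_ACTIVE) session_start();
    return session_id(); // the ID that ties this browser's cookie to the server-side data
}
$sid = startOrGetSession();
echo($sid . "\n");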

The cookie goes back and forth in the HTTP headers. You can see both the headers in the network traffic and the stored session on the client in Control-Shift-I as in India.

The session ID is very helpful, but it's only a portion of the shopping cart code. You addressed some of the rest of it in your other questions. I'll come back to them.

You asked about echos within a PHP / HTML file. In an entry roughly 10 days ago, I suggested up to 4 layers of PHP code from back to front. The echos go in the frontmost layer. An example is my user agent template file. The variables are built deeper and deeper and come out with the echo.

More generally, a .php file can be all HTML, all PHP, or both. If there is no php tag, then the HTML is going straight to output--straight to the client / browser--from top to bottom as any other code. When there is a php tag, that code is run, and any HTML below isn't run until the php tag ends.

You can even do conditional logic on the HTML. You can surround the HTML with { } of an if and conditionally output the HTML. I have an example of that somewhere. Remind me to find it if you don't find one.
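In case you don't find one, here is a minimal sketch of the idea. (This is one of the cases where closing the PHP tag is exactly what the context demands.)

<?php $loggedIn = true; // pretend this came from your session or database ?>
<?php if ($loggedIn) { ?>
    <p>Welcome back!</p>
<?php } else { ?>
    <p>Please log in.</p>
<?php } ?>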

Whether you can do the same thing in JavaScript is both simple and more complicated. The short answer is yes, but if the data is coming from the server, then it still has to get to JavaScript somehow. But yes, you can do the same things in PHP (server side) or JavaScript (client side) with the caveat that the data has to get to the JavaScript. I discussed this at some length "below" such as when I discuss using one big JSON rendered by JavaScript versus writing the HTML in PHP.

How the client and server interact is a big question in that there are at least several good answers.

You mentioned clicks in JavaScript. Yes, detecting clicks and what was clicked and what the click means more or less has to be done in JavaScript, or at least it makes more sense. You mentioned writing to a local JSON. Note that client side JavaScript can't write to an arbitrary local file. JavaScript is very limited for security reasons. There is "local storage" in JavaScript, but I'm not sure there is a point in using it in this case because everything has to go to the server anyhow.

As I mentioned several days ago, I tend to think you want to account for the user moving off the site and then coming back to it, so the cart should primarily live on the server keyed by the session ID. With some exceptions and alternatives, JavaScript data is lost when the user clicks away from the page.

Getting back to the cart more generally, it's probably time to start learning about databases. If you want to punt that, you can save things to your own files or whatnot. You are probably correct that the basic cart concept is harder than you thought. You'll have to learn about client-server interaction and databases, and that's before you do the payment / checkout.

I should probably bang together the simplest shopping cart I can manage--perhaps 2 items with arbitrary quantities. I assume you're done for the day, though. I'm not sure I can be so inspired if you're going to bed. Also, you might need to back up and do some more general playing around with a database. MongoDB would make my life easier. I could live with MySQL, but it would cause grumbling on my part. Relational databases are so 1990s. If you install MongoDB, I recommend Robo3T as a GUI.

Yeah, as I think about it, learning basic database stuff and "hello world" both for the command line and programming databases is probably going to be a detour for the shopping cart. We'll probably do a lot of back and forth on this. For now, I'm not sure how helpful it would be for me to create a shopping cart.

Need a database for the shopping cart?

Do you need a database for the shopping cart? The short answer is very likely yes. The longer answer is something I hinted at above and you did in your email. You mentioned the shopping cart as a JSON file. Yes, the cart can be a JSON file. It's somewhere between impractical and not particularly sensible to save that JSON file on the client side, but you could save it on the server side.

You could do something like that during development, but it's probably time to bite the bullet and learn databases. For one of many examples, if you had a bunch of shopping cart files, it's harder to answer the simple question of "How many orders need to be fulfilled right now?" As its name implies, the purpose of a database is to organize data.

which database?

MySQL is not installed on Kwynn.com. MongoDB most certainly is. I only have MySQL installed on my latest local system because I could not quite justify moving my main (but part-time) client off of it until a few weeks ago. Now I am moving him off of it, but that's in the present tense. I will be happy when I can delete MySQL from my system.

Part of my pitch for Mongo is that you've mentioned a JSON file, and that is one lovely thing about Mongo--you essentially toss JSON files right in the database. To get it into MySQL in a logical format is a lot more work.
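A minimal sketch of what I mean, assuming you have the mongodb PHP extension plus the mongodb/mongodb composer library installed; the database and collection names are made up:

<?php
set_include_path(get_include_path() . PATH_SEPARATOR . '/opt/composer');
require_once('vendor/autoload.php');
$json = '{"item": "pizza", "qty": 2, "toppings": ["mushroom", "onion"]}'; // e.g. the cart as JSON
$doc  = json_decode($json, true);               // decode to an associative array
$coll = (new MongoDB\Client())->shopdemo->cart; // defaults to mongodb://127.0.0.1/
$coll->insertOne($doc);                         // the JSON goes in more or less as-is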

With that said, you pointed out that all the PHP examples you've seen so far are MySQL. MongoDB works in PHP just fine, but I very likely am in a (small? tiny?) minority of PHP users. I assume the common PHP stack is still LAMP (Linux Apache MySQL PHP). Mongo shows up in the MEAN and MERN stacks (MongoDB, Express, Angular / React, Node.js), to name a couple.

On one hand, I leave it as an exercise to the apprentice to research trends in relational versus OO DBs. On the other hand, I can be reasonably sure that MySQL in particular isn't going anywhere anytime soon. There may or may not be a slight trend towards OO, but it must be slight at best.

This is yet another instance of "do something." There will be grumbling from me over MySQL, but just your point about examples is an argument for starting in that direction. (All of my GitHub code is MongoDB, but my code was not written as a tutorial.) Also, I might drop support for MySQL when I drop / delete the whole thing, but that may be months away. Right now I won't even bother with an estimate beyond "months."

On the other hand, just in the last few days I've started writing some of my Mongo code to be executed from the MongoDB command line starting from PHP. In other words, if you could find good lessons on Mongo, the first concern would be learning it generally from the prompt or better yet from Robo3T. If you can learn it generally, running it from PHP is almost identical to running it from Robo3T, now that I have libraries that do that more and more easily.

I'll stop there and see what you come up with.

entry 1 - responsiveness

Continuing the Q&A with my apprentice, today regarding "responsiveness" and such:

I mentioned the JS and CSS refresh issue "below." To recap, several options: you may have to put the code you're actively working on in the HTML file. Then a refresh should work. Remember that your CSS and JS files can be blank or commented out, waiting in the wings for the cut and paste back to them.

If you give the site a unique URL query http://localhost/?blah=1, then ?blah=2, etc., it might help. Also, Firefox will emulate a mobile view to a large degree. When you hit Control - Shift - I (as in India), there is a screen resizing icon on the right middle / at the top of the dev tools screen.

I'm sure there are other ways to solve the JS / CSS refresh issue.

Also, you may be misunderstanding the point of "responsiveness." Ideally the exact same CSS works for both. Ideally it's not a matter of detecting the type of device but writing fully dual-use code. For example, tiles in a grid will settle to 2 rows of 10 columns on a big screen, but will be one column on a small screen. A small number of the tools I use all the time are specifically meant for mobile. For those, I live with cartoonishly large text on a desktop. Keep in mind sizing by vw and vh--viewport width and height, where 100 is full screen in that direction, but you do NOT use the percent sign. You can of course use the percent sign in other contexts. I may use a font size of something like 3vw so that the screen could hold roughly 33 characters width-wise.

With that said, there are probably times when you break down and use the CSS "@media" rule and / or detect dimensions in pixels or such.

I'm sure lots of text has been written on this topic. I am probably not the one to ask for much more, although I may find such issues interesting enough to dig at with you.

2022, January 2

First, an update to my previous entry: For a week or so there was some DARPA LifeLog (Facebook) activity around my very accurate clock web app. Then on the 30th it seems my clock was mentioned in a YouTube video and / or its live chat. I haven't tried very hard to track this down, but so far I have no idea of any more details. Apparently I hadn't updated my web logs locally when I wrote that blog entry on the 30th. I thought I had. Anyhow, it looks like somewhat over 100 people were playing with my clock roughly mid-day on the 30th my time. I have some indication that some came back the next day to count down New Year's in Australia--based on time and a few IP addresses I looked up.

With that, back to the dialog with my apprentice:

Even if you did use NetBeans, it would probably work with Java 11. Given that you're not using NetBeans, no, there is no need to mess with Java in any way.

As for my utilities that I clone as /opt/kwynn: just as with anything else, I don't see a big reason to "activate" / require_once / include them until you need them. You may need them soon, but we'll cross that when we come to it. I will offer two caveats, though, which should also re-emphasize two things I said a few entries ago:

When you're in CLI mode, sometimes you'll see PHP warnings from stderr that NetBeans colors in red. You may see them in web mode / HTML, too. As I mentioned, I changed my handler to kw_error_handler(), and it treats notices and warnings just like errors. After doing this for over a year on my own projects, I am sure it's the right answer. I couldn't do it in Drupal because it was throwing way too many warnings. Now I am doing it in the new version of my steady (but always part-time) paid project, so it will get battle tested more and more.

Perhaps this is too subtle a point for the moment. Also, I suspect PHP 8 does this to a degree--treats more issues as errors rather than a notice or warning. When and if you encounter not-quite-errors, though, keep this in mind.

Also, I have never regretted kwas() all over the place, and that has been battle tested with my paid project. kwas() lets you easily check your assumptions. The first argument is either truthy or else an exception is thrown with the (optional) message in the 2nd argument and an optional error code that I had forgotten about until a moment ago when I looked at the function definition again. Once again, this might be somewhat subtle, but you'll probably figure out how to use it soon.

2021, December 30 - web server log analysis - 3rd entry today

Below are 3 (or more) entries in my apprentice web dev "series."

In the last several weeks I have done more work on my web server log analysis. I'm back to the question of how many human beings read my site, as opposed to robots? Of those human beings, what do they read?

At least 79% of hits to this site identify themselves as robots. See my "user agent" page. I don't have a definite number, but I would guess that of the remaining 21%, half of those are also bots that pretend to be a browser, although the number may be higher.

Of THAT remainder, my own usage of my site is probably yet another half. So my estimate is that 2 - 3% of hits are other humans.

I can identify likely or definite robots in a number of ways. Speaking from experience with my own robots (which have very legitimate purposes), devs rarely update the fake user agent string. If an alleged browser version is over a year old, that's very likely a bot. If it's 4 years old, which I see all the time, that's almost certainly a bot.

At least one bot makes precisely 2 kinds of queries: to my home page and to non-existent pages. It's almost certainly attempting to hack my system by calling non-existent WordPress pages and such.

AWS scans my site to make sure it's up. I can tell by the IP address ownership and because it only reads my home page.

A bot will fetch many HTML pages in a second. Humans don't do that.

I'm sure I'm missing a few.

Of the likely humans, I seem to have some "engagement" in that they move from page to page, but not a whole lot of engagement. In 11+ years of this incarnation of my site, something like 5 people have contacted me based solely on the site.

This might all bring up the question of what's the purpose of having a site. The first answer is that I use it all the time. I have a number of tools I wrote that I use all the time. That's another topic, though.

Myself aside, I would probably keep it up, but that's also another discussion, perhaps for later.

Regarding human readers, recently I'm trying to figure out the chance that my site will solve my housing problem. My jury is still out. I'll have to put in many more hours of work to keep clarifying the log data, and meanwhile I should be room hunting by more direct means.

There's always more to say on this topic. Perhaps later.

2021, December 23 - 30 - probably beyond - web dev Q&A

This is an ongoing Q&A with one of my apprentices.

December 30 (Thu)

entry 3 - starting 21:59

Consider using the "section" tag when appropriate rather than div. I have started doing it here, although I'm not totally consistent. I am almost certain "section" is new in HTML5. Note that every section needs an "h" / "hn" header (h1 through h6), so that's one way you know when it's appropriate. I am assuming section and div are otherwise identical in terms of their defaults, but I am not at all sure.

entry 2 - starting 21:39

Upon thought, I decided to publicly answer another part of your email:

To be honest, I have already forgotten the git features in NetBeans. I've pushed code several times since I mentioned it, and didn't even consider NetBeans. Perhaps I'll manage to use it before 2021 is over, or perhaps not.

Regarding styling, I guess I've become slightly more interested. I'd have to think about that quite a bit. Yes, there has been movement towards being slightly to somewhat more decorative, but I'd have to think about all the reasons why.

When you say "phone viewing," I suspect you meant to use another word. Do you mean talking on the phone twice, and offering to go live? That's a long discussion. I really should publicly explain my issue with the phone in some detail. Not now, though.

entry 1 - posted and announced just before and at 21:37 EST

Regarding your blog, it does validate, so that's a great start. I had almost no idea how you did the 1 second color transition. I vaguely knew such things were relatively easy, but I didn't know details. When I went to look at this color "thing," I ran into another pet peeve.

Immediately upon load / refresh, your page is showing console errors--both JavaScript and HTTP. Also, when you click on the main text of either entry, there is another error.

For unknown reasons, Firefox, if not all (relevant) browsers, gets excited about a lack of favicon. Just toss mine or anything else in there for now. Just make the errors go away!

You may already know more than I do, but did you research the "-moz", "-webkit" and "-ms" vendor prefixes? My understanding is that they're for very old, non-compliant browsers. They may also be for very new features, though. I'm not sure what happened with all that. If it works on both your desktop and phone, I'd call it a win and not clutter your CSS with such things.

I am rewriting this part because before making such a fuss I should justify it. I know that by HTML 4.01 if not long before, styling had been removed from HTML itself, as in HTML tags. That is, the "font" tag and related tags were gone. Therefore, I suspect that the "emsp" / tab HTML entity is frowned upon by HTML5 purists for the same reason. By using the tab, you are bringing styling into the HTML itself rather than keeping it in the styling. I think you can use padding-left, and there are likely other alternatives. I'm almost sure I've seen your spacing issue addressed by CSS.

Rather than projecting upon others, I will declare myself an HTML5 purist and frown upon it myself. In fact, for the first time ever I will use the HTML frown code. It seems appropriate to use it to frown at another code: I will even style it! There. You have been frowned upon.

Yes, that is an appropriate use of the style attribute rather than "class."

Back to whitespace, don't forget the "pre" HTML tag. There are cases where you don't want the default rules of HTML messing with your whitespace, and "pre" is one of the solutions to that. I use it just below and elsewhere in this page. (It's also useful when outputting numerology values and making them line up with the letters.)

Also, I would add the year and timezones to your blog. They don't have to be in the header, just somewhere. Hopefully those entries will be online for many years.

On that note, you may have noticed that my website suffers from a variant of the Y2K issue. I started this incarnation of my site in 2010. I remember thinking about it at the time and deciding that my life would almost certainly be very different by 2020. Certainly I would not be pecking away at the same file hierarchy. The joke is on me. With that failed assumption, in URLs I numbered the years 0 for 2010, 1 for 2011, ..., 9 for 2019. Then I had to use 20 and 21 and very soon 22. This of course throws off the ordering of digits:

/t$ ls
0  1  2  20  21  3  4  5  6  7  8  9

THAT is annoying. Just a cautionary tale.

caching revisited - especially CSS and JS

Firefox and possibly others can be very annoying when it comes to caching / refreshing external CSS and JS. Oftentimes I have given up and put the JS and CSS back in the HTML page just long enough to get it to refresh. I mentioned several days ago that it may be worth quite a bit of coding to check JS and CSS refresh. I think you misunderstood what I meant because you had a file with the date in it. I meant using server-side code to get the filesystem date of all the files involved. Then you'd have to modify the JS and CSS on the server side and then run JS to make sure the changes are made. All of that is probably going too far. Some much easier things that might work are:

Click the JS / CSS link in the "view source" or debugger and hard refresh it. Make sure that is refreshed. Then hard refresh the HTML page. I think that almost always works. Sometimes adding a (literally?) random URL query makes the browser think it's a different page, and that works, such as blah.html?refresh=1. But then you sometimes have to keep incrementing the numbers. When you turn the cache off in Drupal, the query becomes a UNIX Epoch timestamp for that reason.
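If you're generating the HTML with PHP anyway, one low-effort middle ground is to append each file's modification time as the query, so the URL changes whenever the file changes. A sketch, with made-up file names:

// in the PHP that emits the <head>; style.css and main.js are hypothetical
$cssV = filemtime(__DIR__ . '/style.css');   // UNIX Epoch timestamp of the file on disk
$jsV  = filemtime(__DIR__ . '/main.js');
echo '<link rel="stylesheet" href="style.css?v=' . $cssV . '">';
echo '<script src="main.js?v=' . $jsV . '"></script>';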

The situation on mobile can be so bad that you have to put JS and CSS that is under heavy dev in the HTML. Remember that you can make your page a .php rather than .html and simply "include" the styling. For that matter, you can write PHP code to switch back and forth between internal and external.

Remember that you can see caching, or the lack thereof, in the network traffic. I just refreshed your blog, and I had to do a hard refresh to get the CSS to go over the network again. I don't think it showed me explicitly that it was caching; it just didn't show the CSS going over the network. I think you can see the caching itself somewhere.

caching generally

I have suffered many issues over the years with caching in various contexts. I have harsh words for developers who don't make it very easy to definitively turn the cache off. Furthermore, caching is way overused. As 2022 approaches, if you can't make your process work in milliseconds without caching, there is probably something else wrong.

Firefox has a legitimate reason for caching because zillions of people use the browser and thus caching has saved an enormous amount of CPU time and network traffic over decades. However, Firefox should still have a way to definitively turn the cache off. Maybe it does, in fact. I think I've looked, though, with no luck.

back to your blog

When you talk about text wrap, I think you misunderstand what the flex box does. The flex box wraps entire divs. Whether text wraps is a separate issue.

As for centering an image, I have very little advice. Sometimes the "auto" value comes in handy for such things.

December 28 - 2 entries

8:55pm - starting to write

Note the previous entry, about an hour ago.

Do you need the cookies? As I said yesterday, you do at least for the scenario of someone accidentally closing their window and coming back to the site. You may or may not need them other than that; it depends on how you arrange your page.

How to implement them? It's as easy as I laid out yesterday. If you use my wrapper around the session functions, it's that easy. If you don't use my wrapper, see my caution about not restarting an existing session.

If you're considering doing it "from scratch," I would advise against in this case. Out-of-the-box PHP does the job splendidly. This is a case of "just use it." If you want to do it yourself, I'd save that for months from now. The short version is that the cookie goes out in the HTTP response header and comes back in the request header. There's no reason to mess with any of that now. You will want to see the cookie itself in control-shift-I storage.

As for an SSL cert, I very rarely bother with them on my dev machine. My implementation of sessions allows the session to ignore SSL on my dev machine and / or non-AWS machines. My functions assume live is AWS. I may have to deal with that at some point.

If you want to do it, I recommend certbot by Let's Encrypt. I installed it as a snap rather than an Ubuntu package. You have to register a cert against a domain name or subdomain, so you'll need to route such to your dev machine.

8:00pm (approximately)

We talked on the phone for a while. I was giving a lesson and solving webadmin problems as I walked.

Today's phone lesson was in part about Apache DocumentRoot and setting directory permissions for the www-data user / group.

Some reminders from that lesson... The path all the way from / to document root should have my recommended 710 permission and have www-data group access. Document root itself probably needs or should have 750 permission.

Consider changing everything else in ~ to 700 (dirs) or 600 (files). There is a chmod capital-X flag that does this quickly. chmod can be used with the bitmask (octal) notation or the "letter" flags with pluses and minuses.

If you ever figure out how to change the default such that files and dirs don't get such wide permissions, let me know.

Going back to the email exchange, he said he's going to try the JetBrains WebStorm IDE / debugger. Apparently it's proprietary, but he has a free-as-in-beer license from college. (I of course use free as in beer and speech Apache NetBeans.) This is a case where following Kwynn's Rule #1 is more important than open source versus proprietary. As long as he's using a debugger, I will try not to further comment.

> i'm going to hold off on using the integrated git vcs because i want to continue to learn the command line way of doing things and get familiar with that.

Agreed. With that said, I've been vaguely noticing that NetBeans has this stuff. Now that you mentioned it, I looked harder. It hadn't gone through my head that NetBeans has all the commands. For the usual tasks add, commit, push, I think I've got that down at the command line well enough that I may try out NetBeans' commands.

December 27

Starting from one of yesterday's emails, a flex box grid sounds good. I have found it useful. Perhaps some other time I'll do a recursive search on my web tree and find all the instances where I use it. I've considered giving you a copy of the site, in fact.

To various questions from both yesterday and today about the shopping cart and client versus server side... You'll probably want to use PHP sessions, which is a cookie with a unique ID. The PHP functions do it all for you, though, in terms of creating the id and managing the cookie. In kwutils.php, see startSSLSession(). This is at least one big "gotcha" that I solved with that function: the session functions get ornery if you start a session when there is already a session. In fact, doing so might lead to the horrible beast that goes to the effect of "output before [HTTP] headers." Which reminds me:

the output-before-HTTP-headers issue and intentionally NOT closing PHP tags in most situations

The following is a counter-rule to every other situation in programming. In all other cases I know of, if you open a tag you should close it. Do NOT close a PHP tag ?> unless the context demands it!

That is, unless the PHP is switching back and forth with raw HTML, you don't need to and SHOULD NOT close the PHP block / file. Look at my code. I am almost certain I am 100% consistent about this. I would be surprised if you found a counterexample in my GitHub.

The problem is when you have an included file in a mixed-PHP and raw-HTML file. In the included file, if you close the PHP tag and then hit a newline or even a space, that is considered raw text and will be outputted because it's not PHP. If it's an include file or otherwise outputted before the HTML itself begins, you'll get the "output before [HTTP] header" error.

That error indirectly led me to quitting two projects around the same time, many years ago. I spent a lot of time chasing that issue around. I think it took me months of calendar time to figure out what was causing it. And the way it happens is insidious. It's like a virus that seems to pop up at random. Those projects may have gone on for quite some time or even indefinitely, so that little bitty issue may have cost me an enormous amount of money. That's not even the situation where violating (what is now) rule #1 cost me even more, potentially. I'll come back to that.

back to sessions

So, when using sessions, make sure to return the session_id() if it's truthy (sic) rather than trying to restart the session, as my function shows. Then that function calls another in the same file that forces SSL. In your case, you'll (also) want to do the standard Apache rewrite that forces SSL anyhow. You'll want to do that because you're starting from scratch. I am afraid to do it for Kwynn.com at this point. It's on my agenda to thoroughly test it. Perhaps I'm being paranoid, though.
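For illustration only, the "don't restart an existing session" part boils down to something like the following. This is not my actual startSSLSession(), which also handles the SSL side:

// return the session ID, starting a session only if one isn't already active
function getSessionIdSafely(): string {
    if (session_id()) return session_id();   // a session already exists; do NOT start another
    if (!session_start()) throw new Exception('session_start() failed');
    return session_id();
}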

Once the session starts with session_start(), every time a PHP file is called from the client, the session_id() will give you a long-enough unique string. That will help with the shopping cart and otherwise keeping track of what a specific user is doing.

The PHP session functions create a cookie named PHPSESSID. It's 26 characters. I'd have to experiment to be sure, but it looks like a regex of ^[a-z0-9]{26}$. So that's 36 to the 26th power, or roughly 10^40, possible IDs. I think that'll suffice.

server versus client calculations

As for whether to use the client or server for calculations, you MUST at least check the calculations on the server for reasons you mentioned in one of your emails and I mentioned in the last few days in this blog. That is, if you rely on client data, a malicious user can change the price and thus total cost.

With that said, it's a toss up whether to do the initial calculations on the client side or server side. I tend to think it would be tedious to do every calculation on the server and send it up and down. It depends on several things.

"Where are the cart items stored?" If you are using one HTML page, in theory you can store it only on the client side until checkout. However, you should allow them to close the page (perhaps accidentally) and come back to it with the same session ID. (Sessions can last for years or decades, in theory.) Thus, the cart items should be sent to the server on each click and put in a database, keyed by session ID and perhaps other keys depending on how you're arranging the data. "Key" in this case means unique index or the fields that uniquely identify a row in relational or a document in MongoDB (object oriented DB).

You spoke of an unordered list in JavaScript. I have a few guesses what you mean, but I'm not sure. You can keep the cart in JavaScript as a global variable of the object type. Generally speaking globals are frowned upon, but this is a reasonable use case for them. The MongoDB database entry can be the JS variable to a large degree, other than the Mongo version will have the session ID and perhaps some other added fields. Remember that if you go to a new page, you've wiped your JavaScript, so you'd especially need to make sure the server had the cart by then. (Again, the server should probably have it upon every click.)

> [">" indicating apprentice's words] the button executes Java script and that takes the data from their selections and stores it in a shoppingCartTotal

Close. Perhaps more like a global variable object GL_RO_APIZZA_CART that is the entire cart, with or without the total. You may or may not want to save the client-side total in a variable as opposed to displaying the calculation each time. That is, the number and type of each item need to be in the cart, but you don't need to keep track of the total in a variable. This also goes to a larger issue of when to store data that can be calculated. (I don't think I'll elaborate on that for now.) Which way to do it will probably occur to you when the time comes.

abstraction

> - my mind is abstracting away a lot of the data but I’m sticking to your principle of just getting something built. I can see how designing a template based system would be appropriate if I wanted to expand further with the software and incorporate into other small businesses. (Meaning making generic tiles that scan a database and pull in whatever data is there)

You have the idea. It's a tradeoff and a big question as to how much you want to abstract and when. One of my issues with Drupal and WordPress is that they have abstracted to the point they don't do anything specific well. Decades ago a comedian said, "I went to the general store, but I couldn't buy anything specific." That is part of the problem with CMSs.

So yes, in theory you can have generic tiles and generic interactions and calculations. It's hard to say how far you can take that before it becomes too generic / general.

crypto

Yeah, maybe. Sure. I'd get Federal Reserve Notes working first. (If anyone spends legal dollars at any pizza shop, anywhere, let me know. Legal dollars are still gold and silver minted with 2021 stamps by the US Mint. Paper and computer entries are at best fraudulent promises to pay real dollars at some point in the infinite future. The paper and computer bits represent private script created by the banks against your mortgaged house and other such collateral.)

corporations and legal protection

Note that I'm just an amateur legal hobbyist, so I can't give legal advice. With that said:

I am also currently judgment proof, so I'm not really one to talk, but with that said, I tend to think you're being paranoid. If your system accidentally charges someone $1,000, you return the money via the chargeback process, the same way a waitress would do at a restaurant. When the system first goes live, you should have access to the account for that purpose. You should probably always have access to the account.

As for credit card numbers, you're not storing them. The way PayPal and perhaps every other system can work is that the client pays on PayPal's system and your system gets a "callback" when the money is approved. You never see their credit card. For that matter, you don't need their real name, let alone their email. You just need something to tag their order with when they come to pick it up. This can be a small number if you recycle them often.

I'm curious if you can find cases of individuals or small companies being sued for bugs. At this point it should be legally assumed that software comes with no guarantees. It is said that if buildings were built the way software is written, the first woodpecker would end civilization. One of my professors addressed that. The comparison is simply not fair to us. Builders can see and put their hands on and test and inspect everything. There is no such visible equivalent in software. We can only do our best within budget constraints.

Also, a not funny story along those lines. One of my brief quasi-apprentices created the type of corporation that fines you $400 per shareholder for filing taxes late. The business made absolutely zero money, and he was already paying fines. I howled laughing at that. I told him it was one of the best examples I'd ever seen of the cart before the horse, to which he (rather foolishly, as I'll explain) said that at least he had a cart.

Especially in the context of his "cart," people seem to forget that corporations (including governments) are not real in that they are not at all tangible. "The government" does not do anything, only people alleging to act for the government. You don't need a corporation to write software. You don't need a "cart." I've never incorporated and never seriously considered it. There was a situation many years ago where having an artificial entity tax ID would have saved me about $1,000, but the cost of creating and maintaining the entity probably would have approached that. I have no regrets.

That's a tax ID as opposed to a socialist insecurity number that refers to an equally artificial legal entity.

You may decide that there is enough reason to incorporate or write a trust. Trusts have the relevant legal protections of a corporation but don't need to be blessed by the government for their existence. If I were to go that route, I would create a trust.

One of my systems has processed something like $500k over several years in a somewhat different context. I did the first $25,000 "by hand" in that I processed each line item while watching it in the debugger (NetBeans) and stopping several times (breakpoints) for each item. I also had 20 - 30 checks, maybe more, to confirm that I was on the right account and only entering what the client approved. Yes, it was very nerve wracking at the beginning. After all these years, though, my "interlocks" and cross checks and such have done their job.

I've had a number of rather embarrassing bugs on much less critical parts of the system. At one point I lost a reasonable amount of data, although it was reproducible. In a rare event, two data-corrupting bugs have shown up in the last 5 weeks or so. One was likely a rather small amount of data lost that is also reproducible. The other might have caused some minor (moderate?) problems. But this project has a limited budget; I can only do so much testing in the areas where big money isn't at stake. With that said, I'd like to think I've learned something from 2 data-corrupting bugs in 5 weeks.

I know a good lawyer in your area, as we've discussed. :)

back to debuggers

I started this blog page 4 years ago in order to state rule #1, so it's at the bottom of the page. The quick version is "never dev without a debugger," as defined briefly just below.

Just to reiterate that Kwynn's rule #1 applies to both the client and server side. A browser's debugging tools can't help you on the server side. A "debugger" means that you can step through the code line by line, see where the code goes, and check the value of each variable at each point. echo(), print(), printf(), console.log(), etc. are not effective debugging tools. They have very limited purposes, and sometimes you can get away with this, but failure to use a debugger might literally have cost me $100,000s indirectly, so now the tale:

why I created rule #1 after the horse burned with the barn

This was years ago and the last time I tried working 9 - 5. I was working in Ruby and thus didn't know of a debugger. Quick searches didn't turn up any free ones. I don't remember if there were proprietary ones; in hindsight, $500 would have been worth it. I tried debugging with whatever Ruby's print() is. In part because I was so tired, I kept chasing my tail. Part of the problem was that they were using Heroku or something of the sort, which I didn't fully understand. The code was initiated from a worker process callback. A debugger would have brought that to light much faster. I never did solve that bug before I got tired literally beyond reason and quit.

back to debuggers, again

Writing code in gedit and going into NetBeans just for debugging is perfectly fine as long as you aren't hesitating to debug because you're not already there. Also, NetBeans is better at HTML decoration (such as coloration) than gedit. For one, gedit has a very obnoxious bug that causes it to lose all the decoration when I do "h" tags. I just tried it; that bug is still there.

I'm not set on NetBeans as long as you use a debugger (more options below). I have had good luck with it for many years, though. It has a few quirks, but I can live with them. One quirk in 12.4: it will not kill your code, either in PHP or C. That is, you hit the kill / stop button, and rather than dying, the code will go on to the end despite breakpoints. That is somewhat annoying, but I've learned to live with it, too. I may have to write around that, though, for some code. Also, I have not looked into it; there may be a simple solution.

Years ago I used Eclipse, and I briefly used Eclipse again last year. It works. I'm almost certain PHPStorm is proprietary, but in this case I'd prefer you use proprietary software rather than not use a debugger. I'm fairly sure there are other options.

back to client v. server and security

> which raises a question: can i keep all the data that im building on client side for the check out car and do all my calculations on client side as well, then send those off to the payment processor?

Note that my apprentice had not seen the above before asking this. To reiterate the above: you can do the calculations on the client side, but you must also do it once on the server side to check the paid amount.

> i'm assuming it's bad to keep the prices client side

It's fine to send prices to the client side as long as you check / confirm / recalc on the server side. You only have to check once against the payment on the server side. It's probably easier to do it on both sides. It's tedious to go back and forth with the server, so build the cart on the client side. Then do it just once on the server side.
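A minimal sketch of that server-side check, with a hard-coded price table and made-up item names (the real thing would read prices from wherever you store them):

// recompute the total from server-side prices; never trust the client's arithmetic
$prices = ['cheese_pizza' => 12.00, 'side_salad' => 5.00];       // hypothetical menu
$items  = json_decode(file_get_contents('php://input'), true);   // e.g. [{"id":"cheese_pizza","qty":2}, ...]
$total  = 0.0;
foreach ($items as $item) {
    if (!isset($prices[$item['id']])) { http_response_code(400); exit('unknown item'); }
    $total += $prices[$item['id']] * (int)$item['qty'];
}
// charge $total, not any total the client sent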

One thing I didn't mention above is that this is a case for using Node.js as your server-side language. Then the exact same code can calculate on both sides. You can also use Node from PHP in at least two ways. In my generic logon / user identity system, I use the exact same code by calling Node as a shell script.
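The "call Node as a shell script" part can be as simple as the following sketch; calcTotal.js is a made-up file that reads the cart JSON from argv and prints a number:

// run the same calculation code the browser runs, via the node binary
$cmd = 'node ' . escapeshellarg(__DIR__ . '/calcTotal.js') . ' ' . escapeshellarg(json_encode($cart));
$out = shell_exec($cmd);                         // null if the command could not be run
$total = ($out === null) ? null : (float) trim($out);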

I've invested so much time in PHP that this relatively small issue isn't a reason to go to Node. It might be a reason for you to do so, though. I call it a small issue because it doesn't take long in any language to add up a total. That is, it doesn't take me long. What you're doing is non-trivial for a beginner. I'm sure you'll do some floundering.

I'm still deciding on some of the basics of my own webdev. Because certain projects involved Drupal, I didn't have full freedom. Now that I do have full freedom, I'm still working on the best way to do things.

> because someone could essentially change the submission price manually on the cart resulting in bad behavior.

Correct. That is the sort of thing you're protecting against by confirming on the server side.

More generally, any data sent from the client cannot be trusted and you have to consider all the mischief client data can do. So there is SQL injection, injecting JavaScript (or links) into data that may be displayed on the web, and injecting large amounts of data just to run your server out of space. Those are just a few.

In the web contact form in progress in my GitHub right now, I check the format of the pageid. I limit the number of characters. I escape the text when I display it on an HTML page. In other cases, I make sure numbers are numbers. I probably have not thought of everything in that case, but the stakes are not particularly high.
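Those checks look something like this sketch; the field names and limits are illustrative, not my actual contact-form code:

// sanity-check client input before storing or displaying it
$pageid = $_POST['pageid'] ?? '';
if (!preg_match('/^[a-z0-9_-]{1,40}$/', $pageid)) { http_response_code(400); exit; }  // format check
$msg  = substr($_POST['msg'] ?? '', 0, 5000);          // cap the size so nobody fills your disk
$safe = htmlspecialchars($msg, ENT_QUOTES, 'UTF-8');   // escape before echoing into HTML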

modal

"Modal" is one of those terms that annoy me. They make it sounds like some very special thing. Is this a particular modal library, or is it just an example? How big is the library in bytes? I'll come back to this.

a rant on bootstrap.css

The minimized version of bootstrap.css (v3.4.1) is 120kb. The PHP date format documentation uses this. I once decided I wanted my own copy of the table. I had a cow when I found out how big bootstrap.css was. So I started with the "maximized" / dev version and whittled it down to what I wanted. I count 3.4kb.

This also goes to the issue of being so general that it doesn't do anything specific well. Drupal uses Bootstrap, which is one of many issues I have with Drupal. I have elements' styling overridden by Bootstrap. It's very annoying.

back to modal

Anyhow, "modal" sounds so special, but it's not hard to do yourself. First of all, do you need anything of the sort? Why not just a plus and minus and a text number box for quantity? As soon as they push that button, it goes into the shopping cart.

If you want a modal popup-like effect, you can use CSS z-index and fixed positioning. z-index is a pain to use the first time if you don't have the decoder ring, but it may be worth it in the end. Here is the "flag" example. I thought I had another example, but I'm not finding it with a recursive search (grep -R z-index). The key, as I remember, is that the elements involved must have a "position" attribute rather than the default static. If this gets out of hand, let me know. I'm sure there is another example. Also, I have an example in a client's proprietary code.

December 25

debugging PHP
a debugger

Remember that Kwynn's dev rule #1 is to the effect of "never dev without a debugger." I use Apache NetBeans as the GUI of my PHP and C debugger. Before NetBeans will install, though, you need both a JDK and JRE, which I address below. I don't think NetBeans is in the Ubuntu package repositories anymore, so download it directly from Apache. I'm using version 12.4. As best I remember, you download the file and then "sudo bash file.sh" to install it. You run bash because otherwise you have to turn the execute bit on, which you can do graphically or easily via the command line, but just running bash should work, too. You need sudo because it's going to install stuff all over the file tree, such as something close to if not precisely /usr/bin and /usr/lib and such.

NetBeans needs a JRE and JDK. Installation notes below. I'm pretty sure I have used higher versions of such than the following, but these work, so might as well install what I have.

I'm going to list what I have and then explain how they relate to the install commands. I'm going to somewhat change the output to remove clutter. There is some chance you'll already have something installed. If so, see if it works before messing around with different versions.

apt list --installed | grep jdk

openjdk-8-jdk-headless/impish-updates,impish-security,now 8u312-b07-0ubuntu1~21.10 amd64 [installed,automatic]
openjdk-8-jdk/impish-updates,impish-security,now 8u312-b07-0ubuntu1~21.10 amd64 [installed]
openjdk-8-jre-headless/impish-updates,impish-security,now 8u312-b07-0ubuntu1~21.10 amd64 [installed,automatic]
openjdk-8-jre/impish-updates,impish-security,now 8u312-b07-0ubuntu1~21.10 amd64 [installed,automatic]

You'll need to install those 4 packages, where the package itself corresponds to everything before the first /, such as "sudo apt install openjdk-8-jre-headless"

Eventually you'll need to install "php-xdebug"

Then you'll need to make changes to both /etc/php/8.0/cli/php.ini and /etc/php/8.0/apache2/php.ini at the very end of the file, or wherever you want; see just below. I put my name in a comment to indicate where I started a change.

; Kwynn
xdebug.mode=debug
xdebug.client_host=localhost
xdebug.client_port=9003
xdebug.idekey="netbeans-xdebug"
           

Then restart apache (web server) for the apache-php changes to take effect: sudo systemctl restart apache2

Then "debug" a project inside NetBeans and you should get a green line in your code. Beyond that, I should give you a tour. And / or see if you can find discussion of how to use the NetBeans - xdebug - PHP debugger.

kwutils - very strict notice handling and "kwas()"

You'll note that many of my files begin with require_once('/opt/kwynn/kwutils.php'); /opt/kwynn is a clone of my general PHP utilities. You'll have to play with permissions to install it as /opt/kwynn. You can also of course do it however you want, but /opt/kwynn is probably a good idea if you want to easily run my code.

You and I should probably go over kwutils thoroughly some day and whittle on it. It's gotten somewhat cluttered, but I consider it professional grade in that I'm starting to use it in the new version of my regular (but 5 hours a week) paid project. Also, I've been using it on almost all my projects for about 18 months now.

In the first few lines of kwutils.php, I change the error handler such that notices and warnings kill your program just as thoroughly as a fatal error. I have never regretted this decision. It makes for better code. This may be less important in PHP 7 and 8, but I see no reason to change course. I don't think this would help much with your immediate bug, but it's relevant to debugging generally.
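The generic version of that idea, not necessarily line for line what kwutils.php does, is a handler that promotes everything to an exception:

// treat notices and warnings exactly like fatal errors
set_error_handler(function (int $errno, string $errstr, string $file, int $line) {
    throw new ErrorException($errstr, 0, $errno, $file, $line);
});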

Combined with advice below, what would help is my "kwas()" function. It stands for "Kwynn assert," and I want it to have a very short name so that I am encouraged to use it ALL THE TIME, and I do use it all the time. First of all, in your case, use file_get_contents() rather than fopen and fread and such. I use fopen() very rarely versus "fgc".

kwas() does something like "or die()" but I like mine better for a number of reasons. Your code snippet just gave me an idea I should have had ages ago. I need to test something....

Ok, I just changed kwas() to return a truthy (yes, that's a technical word) value.

So now your code would look something like the following. I'm also going to change your path. The path issue might, in fact, be your problem. Also, if you're not using variables or a newline or something that needs to be substituted, use ' (single quotes) rather than " (double quotes). The __DIR__ is a more definitive way of saying "this file's directory." Simply using "." has issues that I have not entirely thought through. I am not guaranteeing the following will run. I'm giving you the idea. I'll never finish this if I test every snippet.

$path = __DIR__ . '/last-updated.txt';
echo(kwas(file_get_contents($path), "reading $path failed or was 0 bytes"));
            

All this may still leave you with another set of problems, so more stuff:

CLI versus web

Part of the problem you're having is that you're just getting a 500 error with no details. There are several ways to deal with that.

PHP is run in CLI (command line) mode or various web modes. Rather than figure out all the web modes, I have found that logical "NOT cli" always means web mode. I address this more specifically below.

I mentioned /etc/php/8.0/cli and /.../apache2. So that means that there is a different configuration for each, and thus different defaults. There are several relatively subtle differences in running PHP each way. In case it's not clear, cli mode means "$ php blah.php" and web mode means Apache or another web server is running the PHP.

Generally speaking, you can at least partially run your PHP web files from the command line. In your case, I think you'd see your bug from the command line. Meaning "$ php index.php" or your equivalent. It's a recent practice of mine, so it's not burned into me, but I'm starting to think you should go somewhat out of your way to make sure your web PHP can run as seamlessly as possible as CLI (command line) for dev and debugging purposes. That is, you may have to fill in stuff that would otherwise be filled in from Apache. Running in web mode is somewhat more painful for a number of reasons, so you should leave yourself the CLI option.

kwutils has iscli() to indicate CLI (command line) mode versus web mode. It in turn uses PHP_SAPI === 'cli', where PHP_SAPI is a predefined constant provided by the PHP interpreter. I mention this because in order to make your code dual-use (cli and web), you'll sometimes need to use that.
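A sketch of the kind of branch I mean; iscli() in kwutils may differ in detail, and the $pageid bit is made up:

// fill in web-ish values when running from the command line for dev / debugging
if (PHP_SAPI === 'cli') {
    $_SERVER['REQUEST_METHOD'] = $_SERVER['REQUEST_METHOD'] ?? 'GET';
    $pageid = $argv[1] ?? 'home';   // take input from argv instead of the query string
} else {
    $pageid = $_GET['pageid'] ?? 'home';
}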

When you have the NetBeans PHP debugger working, you can see all the superglobals and their values.

error.log and error display config

Did you look at /var/log/apache2/error.log ? That probably has the specific error.

By default, web PHP turns off displaying errors because displaying errors (when there are errors) allows anyone on the web to get variable names and such and thus make various injection attacks easier.

Your development machine is exposed to the web, and I'd imagine if you look at your access logs, you'll see that others have already found it. You're running with a 32 bit (IPv4) address, and there are relatively few of those, so bots can find many of them easily enough. (I would not assume that 128 bit (IPv6) is better protection. I'd imagine the hackers have already narrowed down what's in use.)

I mention this because changing the error display even on your dev machine will be seen by the world, and your app will hopefully soon be used in "the real world." We should both give some thought to the implications, but I would err on the side of everyone seeing you err. :) As you see various messages, we should both consider what anyone could get from that. Otherwise put, this is a case of "security by obscurity" probably not being particularly secure.

Besides, this is a small shop, not a bank or crypto exchange. You can almost certainly use PayPal (or others) such that the user's data is not in your system, or at least it's minimally in your system.

With all that said, to turn on errors, change this in /etc/php/8.0/apache2/php.ini:

; Kwynn
display_errors = On
           

Then restart Apache. (The above is line 503 in my file.)

misc audio files as "music" - part 2

This is my 2nd entry and 3rd "h5" header for today. This also gets off topic from web dev, but this is a continuing discussion with one of my apprentices, so I'll leave it here.

This is a followup to non-audio files played as "music." First of all, the "stop" button works for me on Firefox 95 Ubuntu desktop. I haven't checked my web logs to see which user agent you're using. If you come up with a fix, I will almost certainly post it as long as it also works for me. I am not going chasing that bug now. You can add it to the endless list of stuff we might do much later.

As for how it works... Any audio recording encodes a series of volume levels; it's only a matter of how it's encoded. A CD "is a two-channel [stereo] 16-bit ... encoding at a 44.1 kHz sampling rate per channel." (1 / 44,100) === 0.000022676 or 22.676 µs. So, every ~22 microseconds the recording system records the volume of each microphone as a 16 bit number, so 65,536 possible volume levels.

The .WAV may be the original computer sound format. A quick search shows that the original WAV format was the same bitrate as a CD and that SatanSoft once again rears its head. A WAV has a 44 byte header and then the rest of the file is audio encoded as above or else variants of the sample rate and volume bits. For the "symphony," I used 8 kHz and whatever the default volume bits are.

The commands I used to create the WAV are just above the "play" button. I took an Ubuntu install ISO file and treated its bits as sound. (The ffmpeg command added a WAV header.) The result was interesting. It has a beat and an odd sort of music. There's no telling what other files would sound like. I'd imagine people have played with that.
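If you want to see those 44 bytes spelled out, here is a rough PHP equivalent of what that ffmpeg command did: wrap arbitrary bytes in a PCM WAV header. The file names are made up, and I'm assuming 8-bit mono at 8 kHz:

// treat an arbitrary file's bytes as 8-bit mono samples at 8 kHz
$data = file_get_contents('/tmp/some-arbitrary.iso');
$rate = 8000; $bits = 8; $channels = 1;
$byteRate   = $rate * $channels * $bits / 8;
$blockAlign = $channels * $bits / 8;
$header = 'RIFF' . pack('V', 36 + strlen($data)) . 'WAVE'
        . 'fmt ' . pack('V', 16) . pack('v', 1) . pack('v', $channels)  // 1 = PCM
        . pack('V', $rate) . pack('V', $byteRate)
        . pack('v', $blockAlign) . pack('v', $bits)
        . 'data' . pack('V', strlen($data));                            // header is 44 bytes total
file_put_contents('/tmp/symphony.wav', $header . $data);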

Firefox caching

First of all, remember that you usually need to refresh a page before you see changes. Firefox can be stubborn about that. By default, Firefox does a soft refresh. Control - F5 should do a "hard" refresh, but even that doesn't always do the job. The problem gets worse with mobile browsers and external JavaScript and CSS. Consider putting versions or unique timestamps in all the relevant files to see if the right page is shown. Sometimes changing the query on the page helps refresh it, such as /?blah=1 /?blah=2 etc. The query doesn't have to be meaningful or used, but the browser interprets that as a different page, so it may refresh the cache.

When testing mobile, I have had to put JavaScript back into the HTML page as the easiest way to force a refresh of the JS.

To check CSS, sometimes I change the color of a certain element just to check the version. With JavaScript you can set a version with document.getElementById('versionElementForFileXYZ_JS').innerHTML = '2021_1225_1846_25_EST';

I have never taken the following to this extreme, but I suggest a technique below. Rather than going to extremes, once you're aware of the problem, you can usually eventually get everything to refresh. Also, I'm sure I'm missing options. I haven't gone looking all that hard once I understood what the problem was.

A perhaps too extreme measure would be combined server and client code that checks disk timestamps against what's rendered. For CSS, the server code would create a CSS tag like ".cssV20211225_1843_22_EST" or both human readable and a UNIX Epoch timestamp. Then the JavaScript would do a CSS selector query for the existence of that CSS tag.

W3 validator referer

Update: see my first January 13, 2022 entry. The "referer" generally won't work anymore.

Always point the validator check to https rather than http, such as https://validator.w3.org/check?uri=https://blah.example.com/page1.html. If you try to validate a secure page with an http link to W3, it won't work because the browser will not send a referer from a secure page to a non-secure page.

As to why "/check?uri=referer" works, I think I implicitly assumed for a very long time that this was some sort of standard. It's much simpler, though. It's specific to that particular W3 validator tool. Whoever made that tool can write his "?" queries however he wants. It's written such that if you use the "referer" HTTP query argument, the code checks the HTTP request header for the "Referer". Look at your network traffic, and for a .ico or .png or .js or whatnot, you'll see a "Request header" "Referer" field which is a link back to the HTML or PHP page that called the .js file or whatnot. The W3 code reads that referer and thus knows what page to fetch. (control-shift-I and then the "Network" tab shows you the network traffic AFTER you load that tab, so you will have to refresh.)

I wouldn't call it an "API," either. Again, it's much simpler than that.

As for how I knew to link that way, I found the documentation, but I found it because I knew to look for it. Off hand, I did not quickly see that linked from the validator itself. Upon thought, my best memory is that my webdev professor in 2005 showed us that technique. He definitely pointed us to the validator.

As for reading request headers in PHP, one option is apache_request_headers(). I use this in my CMS ETag and modified time test, function exit304IfTS() at the bottom. I think I only implement one of the two so far. It's on my agenda to implement the other.
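For the general shape of that, here is a generic sketch of the If-Modified-Since half; it is not my exit304IfTS(), and the function name is made up:

// send 304 and stop if the client's copy is still current
function exit304IfCurrent(string $file): void {
    $mtime = filemtime($file);
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
    $hdrs  = apache_request_headers();               // the request headers, per the above
    $since = $hdrs['If-Modified-Since'] ?? '';
    if ($since && strtotime($since) >= $mtime) {
        http_response_code(304);
        exit;
    }
}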

December 24 - 2 entries (so far)

16:38 EST entry

This continues a discussion with one of my apprentices, so I may switch from "he" to "you" again.

Today's edition begins with a question about a template and pulling from a database versus hard-coding the menu (see yesterday's entry below). He was concerned about loading delays. You'd have to be the average Indian so-called developer to delay loading that much, or a white American who doesn't understand databases worth a darn and uses loops instead of SQL joins. I once had a manager try to tell me that the queries were "very complicated" and thus they took several seconds. The queries were trivial, and the code should have run literally several hundred times faster.

The point being that loading delay in the context you mean has not been a problem on any hardware in the last 10 - 15 years.

You bring up a more interesting point, though. There is always a tradeoff between making data entry easy versus the entry code making the overall system much harder. Otherwise put, how much trouble do you want to go to at various stages of the project to make it easy for the pizza shop folk to make changes? Given my philosophy of "make something work now versus frittering on perfection forever," I would not worry yet about letting them make changes. At the start, you're presumably going to be on hand pretty much every day. Get the system making money, then decide when it's worth making the tradeoff to let them take some of the workload.

With that said, this brings up the question of validating prices on the server side. Say you hard-code $5 as the price of an item. The client orders one of them, but the client is mischievous and lowers the price to $1. You should always check such data on the server side. So this brings up the interesting question of how to encode the price such that it can both be rendered and checked easily. Putting the price in various data formats makes sense: a database, a CSV file, a JSON file, XML, raw text, etc. Then you'd have to do a bit of processing to render it, but you'd have the validation on hand.
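One concrete way to do that, as a sketch: keep the prices in one JSON file (the file name and structure here are made up) and read it from both the page-rendering code and the order-checking code, so there is only one copy of the truth.

// prices.json (hypothetical): {"cheese_pizza": 12.00, "side_salad": 5.00}
function getPrices(): array {
    return json_decode(file_get_contents(__DIR__ . '/prices.json'), true);
}
// render side: loop over getPrices() to build the menu HTML
// check side: recompute the submitted order's total from getPrices() before charging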

17:42 entry

You mentioned a data object, or a DAO: data access object. This brings up a big question that has many possible good answers: how do you go about getting from the database to the HTML? I have gone back and forth between two methods. I give examples of both further below, once I explain them.

I'm about to explain my interpretation or variant of the MVC pattern or framework--model view controller. The model is the database code that works with the data model. You might call this the far back end (server-side). The controller is in the middle and interacts between the other two. The controller might be on the back end or front end (browser client). The view is the code that creates the human readable format, including the HTML. The view may be created either on the front end or the back end to a degree, but the end result is part of the definition of the front end because it's the front side that the user sees.

A DAO whose only job is to interact between the database and the rest of the code is a good idea in some situations. Less strict but sometimes more practical is code that accesses the db and does the first round of transformations towards HTML.

Once again, you may want to make something work first, however you can. Even 2 - 3 years ago (18 months ago?), I might make a big mess in terms of the code logic, but the end result worked. Then I started cleaning the code, sometimes. Now I am actually starting to code with my variations on MVC. You can see the step by step progress in git commits.

I've gone back and forth between two variants of MVC. My jury is still out, but the technique I am starting to favor goes something like this... Write 2 - 4 layers of PHP code. One or two layers fetch from the database. The second back-end layer may process the data closer towards the end product. Then you may have a layer that makes the data completely human readable, such as turning the float 5.0 into the string "$5.00". This layer may also do the loop that creates an HTML string of table data. The final PHP layer can be echo statements embedded in HTML that write the final product.

Let's take my very recent user agent code. "p10.php" is the innermost layer. Often I actually use the term "dao." In this case I didn't, but p10 is serving as the DAO and it's doing the loops that lay out the data in an array that is close to the HTML table format. "p10.php" is the model. "out.php" is the inner view--the part of the view closer to the back-end model. It changes integer 25000 to string "25,000" and has the loop that creates most of the HTML. Then the template.php has "echo()" functions to write the strings.
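Abstracted away from the user agent code, the layering looks something like this; all the function and file names below are made up:

// dao.php - model: closest to the database
function getMenuRows(): array { /* SELECT or Mongo find() goes here */ return [['name' => 'cheese', 'price' => 12.0]]; }

// out.php - inner view: make the data human readable and build the table body
function menuTableRows(): string {
    $html = '';
    foreach (getMenuRows() as $row) {
        $html .= '<tr><td>' . htmlspecialchars($row['name']) . '</td><td>$' . number_format($row['price'], 2) . '</td></tr>';
    }
    return $html;
}

// template.php - outer view: mostly raw HTML, with short php blocks that echo the strings,
// e.g. a table element whose body echoes menuTableRows()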

The other technique is to create JSON at the PHP side and then let client-side JavaScript process the JSON. I did it that way in a previous user agent version.

I think the more recent way is better, but I'll know more when I get back to my long-term paid project. I'm going to have to make that decision soon.

17:56

Regarding an internal "style" tag or external CSS: I totally rewrote my home page yesterday and posted it an hour or two ago. I was running all over the place adding "class" attributes. I find it easier to have the class attribute and the relevant styling in the same page rather than switching back and forth. This may depend on how big the file is, though. For a big file, going up and down is harder. As I said yesterday, one answer might make more sense during dev and another once you're done dev'ing. I'm not making an argument against your point. I'm just explaining my reasoning.

Regarding big files, here is a thought. When you create a php file, it *IS* an HTML file until the <?php tag, like my "template.php" I mention above. One result of this is that you can use require_once() to add HTML fragments. So, with a large file, you can have a central PHP file that calls subfiles to put them together.
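So a big page can look something like this sketch; header.php and menu.php are hypothetical fragment files that are mostly or entirely raw HTML:

<!DOCTYPE html>
<html lang="en">
<?php require_once __DIR__ . '/header.php';  // raw HTML <head> fragment ?>
<body>
<?php require_once __DIR__ . '/menu.php';    // raw HTML nav fragment ?>
<main>page-specific content goes here</main>
</body>
</html>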

December 23

This is in response to an apprentice's question. He is continuing his own version of the pizza shop online ordering.

The following may or may not be off topic. Perhaps it's past time to say that, as far as I know, he has a pizza shop in mind that simply sells pizzas and is not owned by the man who is mysteriously the 49th most powerful man in Washington for owning a pizza shop. (In all these years, I'd never actually seen the text, but there he still is, 9 years later: #49 James Alefantis.) You'll note that I put $5 or $10 on my version, not $15,000 because I'm selling something other than pizza.

In any event, I will try to keep my technical hat on. In his version several hours ago, he had "pizza.php" and "salad.php" and such, activated by clicking each category of the menu on the left side. He asked my thoughts on this.

I'll switch to "you" rather than he. I have to start with my pet peeve. You didn't close a div, so I'm sure your page is HTML5 invalid. Firefox "view source" shows the close body tag as red; that's why I noticed. (I may have noticed by eye soon after.)

I have to appreciate your use of ":hover" and "active" (Why is it :hover and .active? That doesn't seem right, but it seems to work.) Remember that I try to avoid "pretty" web sites, so I'm only partially aware of such things. I'm glad you reminded me because it's a useful cue to the user. I probably use JavaScript in situations where CSS does the job more naturally.

You might consider pulling your styling into the one HTML page during parts of development. There are arguments either way. I find it useful to have everything right there. As you head towards going live, it probably makes sense to have an external style sheet, but I still argue with myself about that. I'm not sure there is one right answer, either. You can cut and paste your CSS between the two such that the blank external CSS is always there ready to go. There is no reason to remove the "link" tag or delete the external stylesheet, unless you firmly decide to stay within the HTML. And when the previous version is in git, you don't even have to be firm. :)

Now to the original question: about using separate PHP files in that manner. First of all, when you're doing one of your first web apps, whatever works or even heads in the direction of working is progress. With that said, there is no need to reload the page with full-page HTTP calls in your case. Once you have basics of the page, clicking on a menu category should call AJAX JavaScript and only refresh the center of the screen. The AJAX makes the call to PHP.

With *THAT* said, then you get into the question of "single page" PHP. As much as I despise WordPress and Drupal, their notion of single page probably has some merit, although I think they take it too far, and their version gets too complex. Single page means that there is a web server (Apache) rewrite in .htaccess that routes all requests through index.php. The index then routes the requests as needed.
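For the single-page idea, here is one hedged sketch of how it can look. The rewrite lives in .htaccess (shown as a comment), and the page names are made up:

// index.php - every request lands here
// .htaccess (assumed):  RewriteEngine On
//                       RewriteCond %{REQUEST_FILENAME} !-f
//                       RewriteRule ^ index.php [L]
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
switch ($path) {
    case '/':     require __DIR__ . '/pages/menu.php'; break;   // hypothetical files
    case '/cart': require __DIR__ . '/pages/cart.php'; break;
    default:      http_response_code(404); echo 'not found';
}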

Then again, the single page thing may be too much for now. I still have not used it when I'm writing from scratch, but I'm considering it. I may have an update on this in 2 - 3 weeks as I make this decision in a "real world," paid project. (It's not a new project.)

entry history

I expect I'll be revising this for a while, so it needs a history.

  1. nevermind. I hope I labelled the entries well enough
  2. 2021/12/24 17:56 EST - 3rd new entry, same
  3. 2021/12/24 17:42 EST - 2nd new entry, labeled with timestamp
  4. 2021/12/24 16:38 EST - new entry, labeled as "16:38"
  5. 2021/12/24 15:53 EST - fixed Alefantis link
  6. 2021/12/23 17:51 EST - prepping for first post

2021, August 28 - Asterisk compilation revisited

This is a follow-up to previous entries.

I have limited download bandwidth at the moment (long story), and I still haven't perfected VMs and / or Docker and such locally, so in order to get a clean installation and compilation slate, I'll rent an AWS "on demand" instance. Hopefully it will cost me 10 - 20 cents. I want an x86_64 processor so that it's closer to my own machine. I might as well get a local-to-my-instance SSD / NVME (as opposed to an EBS / network drive) for speed, and I should use "compute optimized" because I will peg the CPU for a short while. So my cheapest option seems to be a c5ad.large, currently at $0.086 / hour in northern Virginia (us-east-1).

Instance details: Ubuntu 20.04 (probably will remain the same until 22.04), x86 (just to make it closer to my local machine - "x86" is implied x86_64). Type c5ad.large. I would give it 12 GB storage for the EBS / root drive rather than the default 8. 8 GB may be too little. Assuming you have a VPC (VPN) and ssh keys set up, that's all you need.

Today's greatly improved compilation commands. Notes on this below.

For current versions, one of the first steps calls for downloading "asterisk-xx-current," so be sure to check the relevant Asterisk download directory for higher versions. Note that the versions are not in any useful order, so you'll have to look carefully and / or search. The documentation still references version 14.x.y. I compiled version 18.

When everything is compiled / you're done, the directories use exactly 1 GB (call it 1.1 GB to be safe), but that may grow with future versions.

When running the step "sudo ./install_prereq install" note that the US telephone country code is 1

Note that downloading dahdi and dahdi-tools from Asterisk, as shown in their directions, will not work with recent Linux kernels (5.11, and possibly earlier) because the Asterisk versions are behind. My instructions have you compile the source.

The compilations of dahdi, dahdi-tools, and libpri are quick. Asterisk itself takes almost exactly 5 minutes. From reboot, elapsed time for this day's attempt #1 was 38 minutes; the second attempt was about 23 minutes. I forgot to check the final one. I believe I posted attempt #4 above.

My previous attempt at compilation instructions (days ago), just for the record.

2021, August 22 - 25 - Cardano / Ada cryptocurrency

As of several days ago, I have a Cardano "stake pool" running. It is public, but, for a number of reasons, I'm not going to advertise it, yet.

These are notes on setting up a stake pool. In short, a stake pool is the rough equivalent of a Bitcoin mining node. Bitcoin is "proof of work" (mining); Ada is "proof of stake" (user investing). Bitcoin uses an absurd amount of energy to "mine." Ada's trust is established by the community investing in stake pools. That's the very brief sketch.

The official instructions are fairly good, but, as is almost always the case, they leave a few things out, some things are clear as mud, they make assumptions, etc. These are my annotations.

hardware requirements

Because the instructions start with hardware requirements, I will, too. I seem to be doing fine with 4 GB of RAM, HOWEVER... I have a big, fat qualifier to that further below. I am running two AWS EC2 "c5ad.large" type instances--one for the relay node, and one for the block producer. For "on demand" / non-reserved, $0.086 / hour X 24 hours X 30.5 days (average month) X 2 instances (block producer and relay) = $126 per month just for the CPU. Storage fees are more; that will take a while to nail down precisely; roughly, I'd say that's another $25 / month. Note that reserving an instance--paying some in advance--cuts CPU prices in half. See "reserved instances."

I'll express drive space in two parts. The EBS Linux root ( / ) is only using 2.4 GB with an Ubuntu Linux 20.04 image; the chain database is NOT on root, though (see below). If you decide to save / log the output of the node, note that the block producer has produced 118 MB of output in about 3.7 days; 242 MB in about 7 days. I assume the relay node is much less; I'll try to check later. The block producer outputs every second because it's checking to see if it's the slot leader. The "slot leader" is the rough equivalent of winning the Bitcoin mining lottery and producing a block on the blockchain.

As for the chain database, it is currently 13 GB. Based on everything I've seen, the rate of increase of the database is likely to grow for weeks or months.

After that 3.7 days, I have only been charged 9 cents for 1 GB of output to "the internet" outside of AWS. However, billing is several hours behind. (11 cents in 7 days)

As for their assertion "that processor speed is not a significant factor for running a stake pool...." That appears to be true for the most part, but there are some exceptions, just below.

exceptions to hardware reqs

Processing the ledger ($ cardano-cli query ledger-state --mainnet > /tmp/ledger.json ) used 4 CPUs (cores), took 10 GB of RAM, and ran for about 5 minutes. The ledger was 3.8 GB several days ago. It compressed to 0.5 GB. Don't run this on a stake pool node / instance unless you're sure it can handle it; it's just not worth the risk.

I ran the ledger on an AWS EC2 c5ad.2xlarge instance. I ran it for 0.8 hours X $0.344 / hour = $0.28. That's how long it took me to copy the chain / database from EBS to the local nvme (ssd), load up the Cardano binaries and basic config, sync the database between the saved chain and the current change, run the ledger, compress, and download the ledger.

Similarly, I would be careful running any queries on a live stake pool. I have reason to believe that even short queries like utxo will slow the system down enough that it may miss its slot. In other words, a stake pool node should do nothing but route and process blocks. It should only be running the node, not foreground ad hoc commands.

The instructions try to push you to compiling or Docker, but the binaries available for x86_64 Linux work just fine. The binaries are linked from the cardano-node GitHub or the "latest finished" link. I am using v1.28.0.

You'll want to put the Cardano binaries path in your Linux PATH environment variable. While you're at it, you should decide where you're going to put the Cardano socket. It can be anywhere that your standard user has access to create a file. Cardano runs as the standard user, not a sudoer or root. I won't admit where I put mine because I'm not sure it's a good idea, but I call it, abstractly, /blah/cardanosock which assumes the standard user has rwx access to /blah .

Substituting your own binary path and socket, add these lines to ~/.bashrc :

				export PATH="/opt/cardano:$PATH" 
				export CARDANO_NODE_SOCKET_PATH=/blah/cardanosock
			

Then don't forget to $ source ~/.bashrc for every open shell. The contents of .bashrc don't load until a new shell is opened or you "source" it.

I had never installed or used Docker before. On one hand, I got it all running very quickly, but I haven't learned to deal with Cardano's Docker image limitations yet. It was 40 MB when running, as I remember, which is impressive, but that leaves out too many commands. I may start with an Ubuntu docker image and try to build my own Cardano Docker image at some point. Beyond a quick test, I have not used Docker.

On the config file step, I would add that you also need testnet-alonzo-genesis.json or mainnet-alonzo-genesis.json : use the same wget command as for the other config files and substitute the appropriate alonzo file name.
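
Something like the following, where BASE is a placeholder for whatever base URL the config file page gives you for the other config files:

BASE=https://example.com/cardano-configuration
wget $BASE/mainnet-alonzo-genesis.json
wget $BASE/testnet-alonzo-genesis.json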

Wherever you see --mainnet , you substitute "--testnet-magic 1097911063" (without quotes) for the current testnet. The addresses step shows you how to create a test address such as addr_test1xyzabc123.... In the testnet, you get test Ada (tAda) to play with from the faucet. Enter an address such as the above.
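
For example, the same utxo query against main versus test (this is how I remember the 1.28-era CLI; hedge accordingly):

cardano-cli query utxo --address $(cat payment.addr) --mainnet
cardano-cli query utxo --address $(cat payment.addr) --testnet-magic 1097911063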

Note that you won't see your funds in utxo until your node catches up with the chain. I don't remember how long that took in test: somewhere between 1 - 5 hours. The config file page above shows you how to query your tip. Note that a slot is created every 1 second, so you are comparing your progress against historical slots. You can see where the test and main chains are at the test Explorer and mainnet Explorer. An epoch is 5 days, although I don't know if that ever changed. We are currently in the "Mary" era; I am still not sure if that's the same as the "Shelley" era.
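
The tip query itself is short; a sketch for both networks, again from the 1.28-era CLI:

cardano-cli query tip --mainnet
cardano-cli query tip --testnet-magic 1097911063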

For reference, the current slot times, from which you can calculate the origin:

network   slot (elapsed seconds)   as of
mainnet   38382516                 2021/08/26 03:33:27 UTC
testnet   35579709                 2021/08/26 03:35:25 UTC
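
Assuming GNU date, a quick sketch of the origin calculation from the mainnet row above--one slot per second, so subtract the slot count as seconds:

date -u -d '2021-08-26 03:33:27 UTC - 38382516 seconds'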

On a similar point, if you are still syncing, when calculating your "--invalid-hereafter" make sure to calculate against the live chain, not your tip. Otherwise, your transaction will be immediately invalid (or will have been invalid a year ago).
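
Once your node has caught up, a sketch of basing the TTL on the tip; if you're still syncing, plug in the Explorer's current slot by hand instead. I believe the JSON field is "slot" in the version I used, though it has been "slotNo" in others:

CURRENT_SLOT=$(cardano-cli query tip --mainnet | jq -r '.slot')
INVALID_HEREAFTER=$((CURRENT_SLOT + 10000))
echo $INVALID_HEREAFTER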

The mainnet takes something like 27 - 35 hours to sync. Given that the chain is a linear chain, only 1 CPU / core can be used. Note that the density of data goes way up in the last several months, so you'll plow through historical seconds / slots, and then it takes much longer to process the last few months.

I never got the mainnet loaded on my own computer. For one, the fan ran like I've never, ever heard it before. One day I will likely sync my computer by downloading the chain (see more below).

Regarding utxo and transactions, it wasn't until one of the final steps that I was confronted with a situation where the payment of the 500 Ada stake pool deposit had come in 4 transactions, which is 4 utxos. I had to use 3 utxos to get enough Ada. Below I am shortening and making up utxo addresses / ids:

cardano-cli transaction build-raw \
--tx-in abcde#0 \
--tx-in abcdf#0 \
--tx-in abcdg#0 \
--tx-out $(cat payment.addr)+0 \
--invalid-hereafter 0 \
--fee 0 \
--out-file tx.draft \
--certificate-file pool-registration.cert \
--certificate-file delegation.cert
			

That command comes from the stake pool registration page.

Also when building transactions, keep track of the tx.raw and tx.draft. The draft command and raw command are similar, so it's easy to get that confused. Look at the timestamp and file order of the tx.raw and tx.draft to help keep track. If you mess this up, you'll get a "ValueNotConservedUTxO" error, mixed in with a bunch of other gibberish (partial gibberish even to me!).

Once you submit a transaction successfully, it will show up in the Explorer (see above) within seconds, perhaps 20 seconds at most. Deposits show up in the Explorer as deposits.

Regarding topology:

block producer (I changed the exact address, but it is a 10.0.x.y, which is the relay's address on the same VPC):

$ more kwynn-block-producer-topology-2021-08-1.json
{
  "Producers": [
    {
      "addr": "10.0.157.52",
      "port": 3001,
      "valency": 1
    }
  ]
}
			

Assuming the block producer is running on port 3001, the block producer firewall only needs to admit 10.0.157.52/32 for TCP
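
The console rule works fine; for reference, a hedged AWS CLI equivalent with a placeholder security group ID:

aws ec2 authorize-security-group-ingress \
  --group-id sg-0xxxxxxxxxxxxxxxx \
  --protocol tcp --port 3001 --cidr 10.0.157.52/32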

relay:

$ more kwynn-topology-relay-2021-08-1.json
{
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    },
    {
      "addr": "10.0.157.53",
      "port": 3001,
      "valency": 1
    }
  ]
}
			

The relay needs to admit "the world" on TCP 3001 (or whatever port it's on) because it's receiving from the world.

final stake pool steps / public pools

Using GitHub for storing metadata is a good idea. Note that the git.io URL shortcut will work for anything in GitHub, including repository files or specific repository versions. That is, you don't have to use a Gist. I am using a standard repo file and / or a specific version; I don't remember what I settled on. The metadata hash is public, so I saved it in the repo.
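
For reference, the hash comes from the metadata JSON itself; a sketch, assuming you named the file poolMetadata.json:

cardano-cli stake-pool metadata-hash --pool-metadata-file poolMetadata.json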

(My site is getting queried 30 times a day for testnet; I really need to de-register that thing one day, and return the utxo to the faucet.)

You have to pledge something in the "cardano-cli stake-pool registration-certificate" command, but it seems that it doesn't matter what you pledge. I would assume that the amount has to be in payment.addr, though. The pool cost must be at least the minimum cost as defined in protocol.json in "minPoolCost". pool-margin can be 0 but must be set. You do not need a "single-host-pool-relay" if you're not using one; an IP address does fine.

As far as I understand, you do not need a metadata-url or metadata-hash, but that's what defines a public pool. See below.
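
To put those flags in one place, here is a hedged sketch of the registration-certificate command as I remember it; the amounts, key files, relay IP, metadata URL, and hash are placeholders, and flag names may vary slightly by CLI version:

cardano-cli stake-pool registration-certificate \
  --cold-verification-key-file cold.vkey \
  --vrf-verification-key-file vrf.vkey \
  --pool-pledge 1000000 \
  --pool-cost 340000000 \
  --pool-margin 0 \
  --pool-reward-account-verification-key-file stake.vkey \
  --pool-owner-stake-verification-key-file stake.vkey \
  --pool-relay-ipv4 203.0.113.10 \
  --pool-relay-port 3001 \
  --metadata-url https://git.io/xxxxxx \
  --metadata-hash 0123456789abcdef \
  --mainnet \
  --out-file pool-registration.cert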

public pools specifically

This Cardano Docs page appears to define what a public pool is, but so far I can't get my client's ticker to list on AdaTools. I can get it to list on AdaTools and Pool.vet by pool id.

What I think of as the final stake pool registration page has this command:

cardano-cli stake-pool id --cold-verification-key-file cold.vkey --output-format "hex"

That pool ID is public--it's in the public ledger. It begins with "pool". I'll use other pools as examples, but both show by poolID: AdaTools by pool ID and Pool.vet by pool id.

Pool.vet by ticker works for my client, but AdaTools does not find it in its search.

More importantly, he can't find or pledge to his pool in his Cardano Daedalus wallet. Otherwise put, I seem to be having problems declaring his pool "public," even though pool.vet shows that the metadata hashes match. My only theory at this moment is that I created the pool during epoch 285; epoch 286 is now, and I set it to retire at the end of 286. It's possible that the wallet won't show a pool set to retire in a few days. I thought I had properly un-retired the pool, but results are uncertain after several hours. So far I haven't processed the ledger again to see if the retirement is cancelled.

entry history

I wrote much of this on August 22, 2021.

stuff to update (note to self)

2 VOIP / SIP / Asterisk / voicemail entries:

2021, August 22 - VOIP / SIP / Asterisk - voicemail working

Per my previous entry (7/26), I got voicemail working on July 31. It's taken me a while to write it up in part because I started another project that I hope to write up soon.

I wound up changing my Asterisk system to UDP, so if you're following along at home, be sure to set Amazon Chime's console to UDP. For the moment that's step 16 in the previous entry.

The outgoing voicemail message is limited to a subset of audio formats and "settings." I used an 8kHz, 32 bit sample, mono file. I'm almost certain one can use a higher sampling rate, but it will do for now. The Linux / Nautilus metadata says a 128 kbps bitrate for that file. I assume the math works out. I leave that as an exercise to the reader. My file is kwprompt3.wav placed in /var/lib/asterisk/sounds/en . You'll see the kwprompt3 without the wav necessary in extensions.conf.
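
A sketch of producing such a file from an arbitrary source recording; -ar sets the 8 kHz sampling rate and -ac 1 makes it mono. This yields 16-bit PCM, which I would expect Asterisk to accept even though my own file was a 32 bit sample:

ffmpeg -i greeting-source.wav -ar 8000 -ac 1 kwprompt3.wav
sudo cp kwprompt3.wav /var/lib/asterisk/sounds/en/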

The big problem I had getting voicemail working was that everything would work fine, and then Asterisk would hang up after 30 seconds. That's particularly funny because my potential client is seeking developers because none of the VOIP / voicemail providers allow a voicemail over 10 minutes. My client potentially needs several hours, or perhaps somewhat beyond that. Effectively, he needs unlimited voicemail.

The first key that led me to a solution was setting logger.conf to give me very verbose output--the (7) indicates 7x verbosity. I've seen examples give 5x, so I don't know if 7x gives any more, but it works. The other key was to set "debug=yes" in pjsip.conf, shown in the same file above.

When I called the voicemail phone number and looked at /var/log/asterisk/full, I would see the SIP INVITE transmitted over and over. I don't remember which way the INVITE goes; the packets are sometimes hard to interpret. In each INVITE, I would see 2 lines that began with "Via: SIP/2.0/TCP" and "Via: SIP/2.0/UDP" The lines were next to each other. The TCP line was to an external IP address; the UDP line was to an internal IP address (10.0.x.y). The Amazon Chime system that was routing the call to me is definitely external to my AWS VPC / VPN, so this was a big hint: the INVITE exchange was not being completed because the packet wasn't going from my system to the external internet. After 30 seconds, Asterisk would issue a SIP BYE command and hang up.

It took me several hours to stumble across the solution: at least one of the entries "external_media_address" and "external_signaling_address" in pjsip.conf (see previous conf links). I set them to the external IP address (Elastic IP) of my Asterisk instance / virtual machine. Then it worked!
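
For reference, a sketch of those two settings; they belong in the transport section of pjsip.conf, the section name below is arbitrary, and 203.0.113.10 stands in for the Elastic IP:

[transport-udp]
type=transport
protocol=udp
bind=0.0.0.0
external_media_address=203.0.113.10
external_signaling_address=203.0.113.10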

Given my setup, the voicemails are stored in /var/spool/asterisk/voicemail/vm-try1/1/INBOX . The same voicemail is stored in 3 formats. I assume that is the line in voicemail.conf "format = wav49|gsm|wav" That's a 1990s era raw wav format, a modern, compressed WAV format (wav49, apparently), and a gsm format. The WAV and GSM are of a similar size. Given the purpose of this project, keeping the raw wav format is probably worthwhile. Off hand, I hear very little difference, but I have not tested that hard and with very many voices / conditions.

So far my potential client left a 42 minute voice message which, as best I can tell, worked fine. (I have not exhaustively tested it, but that's another story.)

2021, July 26 - VOIP / SIP (last revised roughly 9:30pm my time)

The result of the following is that I reserved a phone number and dialed it and got literally "hello world" from my Asterisk server.

A few days later I had voicemail working. Voicemail is in my August 13 entry.

Asterisk

I answered an ad about VOIP. The key of the project was that the client needs to be able to leave more-or-less arbitrarily long voice messages. I haven't gotten to the point of just how long, but definitely well over 10 minutes. I would guess that an hour is needed. The problem they had is that they talked to 15 VOIP providers and no one went over 10 minutes.

I had a brush with a VOIP project in early 2016, and I've always wondered "What if?" I played some with the Asterisk software but couldn't make much of it. I compiled it and had it running in the barest sense, but didn't get it to do anything. Asterisk is of course free and open source.

In part because I had unfinished business from 2016, I started experimenting. Then I got obsessed and started chasing the rabbit. After about 21 hours of work spread over a week or so, I have most of the critical elements I need in two "pieces"--part in the cloud and part on my own server.

UPDATE: I greatly improved the following on August 28. I eliminated all "tail chasing." I also wrote up new notes in a new blog entry.

Here is an attempt at an edited version of my Asterisk install command history. One important note is that some of that was probably tail chasing versus:
sudo ./install_prereq
sudo ./install_prereq install

Then I changed 4 config files.

Probably more to come, but I have an apprentice live right now reading this.

AWS

In almost all cases, the AWS documentation is excellent. In this case, I chased my tail around. In the end, I got somewhat lucky. Of all the weird things, I have the darndest time finding the right AWS console. The link is for the AWS Chime product including "voice connectors." So THERE is the console link.

I have the "hello world" voice which will probably download and not play. Someday perhaps I'll make it play. It's a lovely, sexy female voice--a brilliant choice on the part of the Asterisk folk. REVISION: I got some grief over "sexy." Perhaps she's only sexy when you've spent 21 hours getting to that point.

I just confirmed that the Chime console does not save in your "recently used" like everything else does. So I'm glad I recorded the link.

At the Chime console, you'll need the 32 bit IP (IPv4) address of your VOIP server, or domain name. With only a bit of trying and study, I could not get 128 bit IP addresses (IPv6) to work--they were considered invalid.

  1. At the Chime console, go to "Phone number management," then "Orders," then "Provision phone numbers."
  2. Choose a "Voice Connector" phone number. (I am using SIP, but don't choose that option.)
  3. Choose local or toll free, then pick a city, state, or area code. Pick a number or numbers and "provision."
  4. After "provision" / ordering, it may take roughly 10 seconds to show up in the "Inventory" tab. You can use the table-specific refresh icon to keep checking (no need to refresh the whole page)
  5. Go to "Voice connectors" and "Create a new voice connector"
  6. The name is arbitrary but I believe there are type-of-character restrictions
  7. You'll want the same AWS region as the VOIP / SIP server.
  8. I have not tried encryption yet, so I disable it. (One step at a time.)
  9. "Create"
  10. click on the newly created connector
  11. Go to the "origination" tab
  12. Set the "Origination status" to Enabled
  13. Click a "New" "Inbound route"
  14. Enter the IP address or domain of the Asterisk "Host"
  15. the port is 5060 by default
  16. protocol is whatever you set the VOIP server to. I used TCP for a test only because it's more definitive to tell if it's listening
  17. set priority and weight to 1 for now. It's irrelevant until you have multiple routes.
  18. Add
  19. Save (This addition step trips me up.)
  20. Go to the "phone numbers" tab and "assign from inventory." Select your phone number and "assign..."
  21. Set /etc/asterisk/extensions.conf to the phone number you reserved (see my conf examples above and the dialplan sketch after this list)
  22. Restart Asterisk if you changed the number. There is a way to do it without restart.
  23. make sure Asterisk is running - I find it best to turn it off at the systemctl level and simply run "sudo asterisk -cvvvvvvv" Leave the Asterisk prompt sitting open so you can see what happens
  24. open up port 5060 at the AWS "security group" level for that instance
  25. Dial the number and listen to "Hello world!"
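
Regarding step 21, a minimal dialplan sketch for extensions.conf; the context name and phone number are placeholders, and the context has to match whatever your pjsip endpoint points to. "hello-world" is a sound prompt that ships with Asterisk:

[from-voip-provider]
exten => +15555551234,1,Answer()
 same => n,Wait(1)
 same => n,Playback(hello-world)
 same => n,Hangup()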

2021, July 8 - zombie killing

I can now add zombie killing to my resume. I logged into this website roughly 30 minutes ago and was greeted with the "motd / message of the day" message that there were 75 zombie processes. I barely knew what a zombie is.

First I had to find out how to ID a zombie. The answer is "ps -elf | grep Z". My new "simptime" / simple time server was causing the problem.

It didn't take long to more or less figure out what a zombie is, but it took just slightly longer to find what to do about it. When a process forks, the parent is supposed to be fully attentive waiting to receive the exit / return value of the child, or it is supposed to make itself available (signal handler) to receive the value. If the parent is sleeping or waiting for something else, the parent never reads the return, and the child's entry stays in the process table. The child is dead and not using any other resources, but one potential problem is that the process table fills up. Another problem is that the ps command (depending on switches) shows a bunch of "defunct" entries. (Similarly, there may be more entries in /proc/).

A Geeks for Geeks zombie article explained how to stop the zombies; I chose the SIG_IGN option which tells the OS that the parent doesn't care what the exit value is, so the child's process entry is removed. I don't care because, for one, I have other ways of testing whether the system is working. For another, the parent can't "wait()" in my case because its job is to immediately start listening for more connections. Another option is a signal handler, but there is almost no benefit to the parent knowing the value in my case. Again, I have other ways of testing whether everything is working.

2021, July 5 - yet another round with a blasted CMS

I have encoded below my software dev rule #4 about being careful of CMSs. I got burned again last night--Happy July 4 to me! I am building an Ubuntu 21.04 environment from scratch as opposed to upgrading. There are several reasons, but I suppose that is another story. Anyhow, I was trying to get Drupal 7 to run in the new environment. Upon a login attempt, I kept getting a 403 error and "Access denied" and "You are not authorized to access this page" even though I was definitely using the right password.

To back up, first I was getting "PHP Fatal error: Uncaught Error: Undefined class constant 'MYSQL_ATTR_USE_BUFFERED_QUERY' in /.../includes/database/mysql/database.inc" Thankfully I remembered that it's Drupal's crappy way of saying "Hey, you don't have php-mysql installed," so sudo apt install php-mysql Note that you have to restart Apache, too.

Similarly, Drupal's crappy way of saying "Hey, you don't have Apache rewrite installed" was a much more tangled path. I foolishly went digging in the code with the NetBeans debugger. This is a case of "When you're not in the relevant parts of Africa, and you see hoof prints, think horses, not zebras." I assumed a problem with Drupal rather than the obvious notion that something wasn't set up right.

I eventually got to code that made it clear that the login was not being processed at all. By looking at the conditions, I eventually realized that Drupal wasn't receiving the login or password. Then I realized that none of $_REQUEST, $_POST, or $_GET were showing the login and password. So I searched on that problem and quickly realized that it was a rewrite / redirect problem.
sudo a2enmod rewrite
sudo systemctl restart apache2

Problem solved! I won't admit after how long.

I was inspired to write some code for the "Never again!" category (a more legitimate use of the phrase than some, I might add).

2021, March 4 - 5 - Robo3T copy

The makers of Robo3T have started asking for name and email when you download. R3T is of course free and open source (software - FOSS), as is almost everything I use. I got the latest version directly from them, but I thought I'd provide it for others. Providing it for others is part of the point of FOSS.

Download - robo3t-1.4.3-linux-x86_64-48f7dfd.tar.gz

SHA256(robo3t-1.4.3-linux-x86_64-48f7dfd.tar.gz)= a47e2afceddbab8e59667facff5da249c77459b7e470b8cae0c05d5423172b4d
Robo 3T 1.4.3 - released approximately 2021/02/25	

I'm messing with this entry as of the 5th at 12:08am my time. I first posted it several minutes ago.

2021, Jan 31 - yet more on time measurement and sync

I'll go back a year and try to explain the most recent manifestations of my time-measuring obsession. I wasn't so much interested in keeping my computer's time super-accurate as I was interested in how to compare it with "official" time. Otherwise put, how do I query a time server? The usual way turned out to be somewhat difficult. (It just occurred to me a year later that perhaps NTP servers don't check the incoming / client time info. Or perhaps they do. In any event...) The usual way is first demonstrated in my SNTP (simple NTP) web project (GitHub, live).

During those explorations, I found the chrony implementation of the network time protocol (NTP). This both keeps "super" accurate time, depending on conditions, and it tells you how your machine compares to "official" time. That kept me happy for a while, but then I started wondering about the numbers chrony gives me.

So I updated the web SNTP code and made a command line (CLI, command line interface) version. (Note that in that case I'm linking to a specific version because that code will likely move soon.) In good conditions, that matches chrony's time estimate well enough. Good conditions are AT&T U-Verse DSL at a mere 14 Mbps download speed accessed through wifi with 60 - 80% signal strength. Both U-Verse and my wifi signal are very, very stable. (I think it's still called DSL, even after ~22+ years. It involves something that looks like a plain old telephone line, although I can't be sure it's the same local wiring as 40 years ago.)

I can use the "chronyc tracking" command to get my time estimate, or I wrote a tabular form of it.

Below are my chrony readings as of moments ago (5:40pm my time). I'm removing some less-relevant rows.

/chronyc$ php ch.php
 mago    uso    rdi      rf    sk   rde      f
145.3     +0  50.91    -0.18  13.1   65   -7.794 
 96.3   +719   1.40    40.71  13.1   36   -7.794 
 95.2    -56   0.97    -0.20  10.5   37   -1.487 
 89.3    -63   1.59    -0.05   1.9   36   -5.476 
  1.8    +10   1.06    -0.00   0.3   36   -7.450 

Weeks later... I'm going to let this post die right here, at least for now. I hadn't posted this as of March 3.

2021, Jan 29 - chrony continued

As a follow up to my previous entry, now I've set minpoll / maxpoll to 1 / 2 with my cellular network. THAT gets results. My offset time approaches that of a wired connection, and it's the same with root dispersion and skew.
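
For reference, that setting is per time source in /etc/chrony/chrony.conf; something like this line, using kwynn.com as the source per the next entry:

server kwynn.com iburst minpoll 1 maxpoll 2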

2021, Jan 28 - chrony on wired versus wireless

chrony is a Network Time Protocol (NTP) client / server; in other words, it helps computers keep accurate time by communicating time "readings" over the internet.

In the last few weeks I have set chrony to use kwynn.com as its time source. Kwynn.com lives on Amazon Web Services (AWS). AWS has a time service, and my "us-east" AWS region is physically close to the NIST time servers in Maryland. Right now I have a root dispersion and root delay of around 0.3ms, and my root mean square offset from perfect time is 13 microseconds (us or µs). I have 3 - 5 decimal places after that, but I won't bore you any more than I already am. The point being that it's probably just as good or better than using the NIST servers.

I've tested kwynn.com versus using it plus other servers in the Ubuntu NTP pool, and kwynn.com is much, much better. This is one of several stats that I may quantify one day, but I want to get the key point out because I found it interesting and want to record it for myself as much as anything.

Among other features, chrony has the "chronyc tracking" command that gives you an estimate of your clock's accuracy and various statistics around that estimate. Then I check chronyc against a script I wrote that polls other servers and outputs the delay, including an arbitrary number of polls of kwynn.com. Sometimes I'll query kwynn.com 50 times, seeking the fastest turnaround times which in theory should be the best. I call this my "burst" script.

On AT&T UVerse (I think that's still "DSL.") at what is probably the slowest available speed (14 Mbps / 1.4 MBps), chrony is very stable. What chrony says versus "the burst" is very close.

On my T-Mobile (MetroPCS) hotspot, things get more interesting. Sometimes when I cut over from AT&T to wireless, my time gets pretty bad and the chronyc readings are very unstable. This evening it was so bad that I changed my minpoll / maxpoll to 2 / 4. (Depending on my OCD and my mood, I tend to have it on 4 - 5 / 6 - 7.) Note that you should not use such numbers or even close with the public NTP pool, and you may or may not get away with it using NIST--please check the fine print.

When I set min / max to 2 / 4, that's when things got interesting. On one hand, the chronyc numbers stabilize to the point that they get close to wired numbers. On the other hand, comparison to "the burst" is not nearly as "convincing" / close as wired. That is, chrony claims accuracy in a range of 100 - 300 us, but it's hard to get a "burst" to show 3 - 4 ms. The burst almost never shows time as good as chrony claims, but that's another discussion.

Otherwise put, with a low poll rate on wireless, chronyc claims to be happy and shows good numbers, but agreement with the burst is not nearly as close.

This is mostly meant as food for thought, and perhaps I'll give lots of gory details later. I mainly wanted to record those 2 / 4 numbers, but I thought I'd give some context, too.

2021, Jan 23 - detecting sleep / hibernate / suspend / wakeup in Ubuntu 20.04

In Ubuntu 20.04 (Focal Fossa), executables (including scripts with the x bit set) placed in /lib/systemd/system-sleep/ will run upon sleep / hibernate / suspend and wakeup. This is probably true of other Debian systems. I mention this because for some distros it's /usr/lib/systemd/system-sleep/

One indicator I had is that the directory itself already existed and 2 files already existed in it: hdparm and unattended-upgrades. There are some comments out there that /lib/... is correct for some Debian systems, but I thought this was worth writing to confirm.

example script

/lib/systemd/system-sleep$ sudo cat kw1.sh
#!/bin/bash 
echo $@ >> /tmp/sleeplog
whoami  >> /tmp/sleeplog
date    >> /tmp/sleeplog
	

The bits:

/lib/systemd/system-sleep$ ls -l kw1.sh
-rwxrwx--- 1 root root 158 Jan 23 18:18 kw1.sh
	

output:

$ cat /tmp/sleeplog
pre suspend
root
Sat 23 Jan 2021 01:39:49 AM EST
post suspend
root
Sat 23 Jan 2021 06:08:02 PM EST
	

The very careful reader will note that the script above is less than 158 bytes. I added a version number and a '******' delimiter after the first version. I'm showing just the basics, in other words, and I'm showing the parts that I know work.

2020, Nov 20 - arbitrary files played as "music"

As part of my now-successful quest for randomness from the microphone, I came across non-randomness from a surprising place. I generated the following audio file with these steps:

dd if=~/Downloads/ubuntu-20.04.1-desktop-amd64.iso of=/tmp/rd/raw.wav bs=2M count=1
ffmpeg -f u8 -ar 8k -ac 1 -i /tmp/rd/raw.wav -b:a 8k /tmp/rd/ubulong.wav
ffmpeg -t 1:35 -i /tmp/rd/ubulong.wav /tmp/rd/ubu95s.wav
chmod 400 /tmp/rd/ubu95s.wav
mv /tmp/rd/ubu95s.wav /tmp/rd/ubuntu-20-04-1-desk-x64-95-seconds.wav

Turn your speakers down! to about 1/4 or 1/3 of full volume. I now present Ubuntu Symphony #1 - opus 20.04.1.1. There is a bit of noise for less than 2 seconds, then about 3 seconds of silence, and then nearly continuous sound.

I posted several versions quickly; the final version was posted at 6:27pm on posting day.

I'm adding some discussion a year later.

2020, Oct 15 - SEO

In the last few weeks I finally took a number of SEO steps for this site. I'd been neglecting that for years. I registered the httpS version of kwynn.com with Google, and I created a new sitemap with a handful of httpS links.

A few weeks after the above, I got some surprising Google Search Console results. I have 247 impressions over 3 months for my PACER page. I only have 6 clicks, and I suspect that's because the page's Google Search thumbnail / summary / whatever shows an update date of November, 2017, which is incorrect. Soon I am going to attempt to improve that click through rate.

limitations of RAM, speed, etc. 2020, Oct 7 - entry 2 of the day

My only active apprentice just bought an ArduinoBoy in part because he is fascinated by the idea of wrestling with 1980-era limitations of RAM and such. As I discussed with him, I am not dissuading him from that. However, I wanted to give him something to think about.

Last night I managed to crash several processes and briefly locked up my session because I didn't consider that there are still limitations on relatively modern hardware. It's much harder to do that much (temporary) damage today than it was in 1995 or 2003, but it's still possible.

Generally speaking, I was testing something that involved all cores at once and as many iterations as I could get. I got away with 12 cores times 2M iterations (24M data points total). Then I ran that again without wiping my ramdisk (ramfs), so I was able to test 48M data points. Then when I tried to run 12 X 8M = 96M, my system went wonky.

I have not done a post-mortem or simple calculations to know what specifically went wrong. I probably exceeded the RAM limitation set in php.ini. I may have exceeded system RAM, but I don't think so. What is odd is that my browser crashed, and it was just sitting there innocently. It was not involved in the wayward code. All the CPUs / cores were pegged for a number of seconds, but that shouldn't have that effect.
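
Two quick checks for that post-mortem, whenever I get to it; memory_limit applies per PHP process, so 12 forks each get their own cap:

php -r 'echo ini_get("memory_limit"), PHP_EOL;'
free -h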

Maybe he'll want to figure out what went wrong and how to most efficiently accomplish my testing?

On a related point, one thing I learned is that file_put_contents() outputting one line at a time simultaneously from 12 cores does not work well, which makes perfect sense with a few moments of thought. So I saved the data in a variable until the "CPU stuff" was done and then wrote one file per process. (fopen and fwrite were not notably faster in that case.)

So how do I accomplish the testing I want with as many data points as possible, as fast as possible, without crashing my session (or close enough to crashing it)? The question of limitations applies on a modern scale.

Apparently the current version of the code is still set for 96M rows. The October 3 entry of my GitHub guide explains what I was doing to a degree. I'll hopefully update that page again sometime this week, and try to explain it better.

I also observed several weeks ago that forking processes in an infinite loop will very thoroughly crash the (boot) session to the point of having to hold down the start button. Up until very roughly 2003, when I was still using Satan's Operating System, any infinite loop would crash the session. Now a client-side JS infinite loop will simply be shut down by the browser, and similarly contained in other situations. But infinitely forking processes on modern Ubuntu will get you into trouble. I suppose that's an argument for both a VM and imposing quotas. I took the quota route.
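
The quota route looks roughly like this--a per-shell cap with ulimit, or a persistent one in /etc/security/limits.conf (the username and number are placeholders):

ulimit -u 2000
# or, one line in /etc/security/limits.conf:
#   someuser  hard  nproc  2000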

As best I remember, the code in question was around this point (AWS EC2 / CPU metrics process control).

new rules of software dev - numbers 3 and 4 - 2020, Oct 7 entry 1 of the day

The first two rules are at the beginning of this blog.

Kwynn's rule of software dev #3:

Never let anyone--neither the client nor other devs--tell you how to do something. The client almost by definition tells you what he wants done, not how.

This applies mainly for freelancing, or perhaps one should freelance in order to not violate the rule.

I should have formulated this in 2016 or 2017. I finally had one last incident in the summer of 2020 that caused me to formalize it, and now I'm writing it out several weeks later.

To elaborate on the rule, if you know all the steps necessary to do something in a certain way, do it. After it's done your way, no one is likely to argue with you. If you try to do it someone else's way, you are likely to waste a lot of time and money.

An example: beware of when the client requests that you do the quick fix. If your way is certain and the quick fix is uncertain, then by the time you finish the quick fix, you could have both fixed the problem and ended up with a better code base by doing it your way.

Another statement of the rule is to beware of assuming that others know more than you do. Specifically beware of those who you may think are developers but are actually developer managers or salespeople with delusions of developing. I once knew a developer manager who exemplified the notion "He knows just enough to be dangerous." He led me into danger.

Kwynn's rule of software dev #4:

Custom-written software is often the best long-term solution. Be very careful of content management systems, ERP systems, e-commerce systems, etc.

To quote a comedian from many decades ago, "I went to the general store, but I couldn't buy anything specific." That reminds me of WordPress, Drupal, OpenERP (I doubt Odoo is any better.), etc. There is plenty more to say on this, but it will have to wait.

July 18, 2020

Some words on JavaScript var, let, const. I'll admit to still being fuzzy on some fine points, but here are some rules of thumb I've come up with that are well battle tested:

June 21, 2019

Over the last several weeks, I ran into 5 - 6 very thorny problems. Let's see if I can count them. About all I'm good for at this moment is writing gripy blog posts, if that.

My June 12 entry refers you to the drag and drop problem and the hard refresh problem. Those are 2 of the problems.

I just wrote an article on network bridging and using MITM (man in the middle) "attacks" / monitoring. Getting both of those to work was a pain. The bridging took forever because the routing table kept getting messed up. The MITM took forever because it took me a lot of searching to find the necessity for the ebtables commands.

After I solved the Firefox problems mentioned on June 12, I ran into another one. The whole point of my "exercise" for calendar months (weeks of billable time) was to rewrite the lawyer ERP timecards such that they loaded many times faster. They were taking 8 seconds to load, and *I* did not write that code.

Load time was instant on my machine. Everything was good until I uploaded the timecard to the Amazon nano-instance. Then the timecards took 30 - 45 seconds to load. The CPU was pegged that whole time. So, I'm thinking, my personal dev machine is relatively fast. The nano instance is, well, nano. So, I figured, "More cowbell!". At a micro-instance, RAM goes from 0.5 GB to 1 GB. That appeared to be enough to keep the swap space usage to near zero. No help. Small--nope: no noticeable change. At medium, CPUs go from 1 to 2. Still no change. I got up to the one that costs ~33 cents an hour--one of the 2xlarge models with 8 CPUs. Still no change. WTF!?!

I had started to consider the next generation of machines with NVMe (PCI SSDs). My dev machine has NVMe, so maybe that's part of the problem. However, iotop didn't show any thrashing. It was purely a CPU problem.

So, upon further thought, it was time to go to the MySQL ("general") query log. The timecard load was so slow that I figured I might see the query hang in real time. Boy, did I ever! I found one query that was solely responsible. It took 0.13s on my machine and 46s on an AWS nano (and on much more powerful instances). That's 354x.
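
For reference, a sketch of turning the general query log on just long enough to catch the culprit; general_log and general_log_file are standard MySQL globals:

mysql -u root -p -e "SET GLOBAL general_log='ON'; SHOW VARIABLES LIKE 'general_log_file';"
# reproduce the slow page load while tailing that file, then:
mysql -u root -p -e "SET GLOBAL general_log='OFF';"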

The good news was that I wrote the query, so I should be able to fix it, and it wasn't embedded hopelessly in 50 layers of Drupal feces. (I did not choose Drupal. I sometimes wish I had either passed on the project or seized power very early in my involvement. My ranting on CMSs will come one day.)

I thought I isolated which join was causing trouble by taking query elements in and out. I tried some indexes. Then I looked at the explain plan. It's been a long time since I've looked at an explain plan, but I didn't see anything wrong.

My immediate solution was to take out the sub-feature that needed the query. That's fine with my client for another week or two. Upon yet more thought, I should be able to solve this easily by using my tables rather than Drupal tables. I've written lots of my own tables to avoid Drupal feces. It turns out that using my tables is a slightly more accurate solution to the problem anyhow.

One of the major benefits of using AWS is that my dev machine and the live instance are very close to identical in terms of OS version, application versions, etc. So this is an interesting example of an exponential effect--change the performance characteristics of the hardware just a bit, and your query might go over the cliff.

I guess it's only 5 problems. It seemed like more.

June 12, 2019 - a week in the life

I created a new page on some of my recent frustrations--frustrations more than achievements. We'll call it "a week in the life." I thought browser differences were so 2000s or 200ns (2000 - 2009).

March 9, 2018 - upgrading MongoDB in Ubuntu 17.10

This started with the following error in mongodump:

Failed: error dumping metadata: error converting index (<nil>): conversion of BSON value '2' of type 'bson.Decimal128' not supported

Here is my long-winded solution.

March 8, 2018 - anti-Objectivist web applications

I was just sending a message on a not-to-be-named website, and I discovered that it was eliminating the prefix "object" as in "objective" and "objection." It turned those words into "ive" and "ion." Of course, it did it on the server side, silently, such that I only noticed it when I read my already-sent message. The good news is that the system let me change my message even though it's already sent. I changed the words to "tangible" and "concern."

I have been teaching my apprentice about SQL injection and what I call the "Irish test": Does your database accept "O'Reilly" and other Irish names? This is also a very partial indication that you are preventing SQL injection. Coincidentally, I emailed a version of this entry to someone with such an Irish name. So far, sending him email hasn't crashed GMail. They probably use Mongo, though.

If you haven't guessed, what's happening in this case is eliminating "object" because it might be some sort of relative to SQL injection. I thought I've seen evidence that the site is written in PHP, but, now that I look again, I'm not as sure. This is knowable, but I don't care that much. I don't think "object" is a keyword in either PHP or JavaScript. (Yes, I suppose I should know that, too, but what If I chased down every little piece of trivia?!) In any event, someone obviously got a bit overzealous, no matter what the language.

I will once again posit to my apprentice that I don't make this stuff up.

The final word on SQL injection is, of course, this XKCD comic. I must always warn that I am diametrically opposed to some things Munroe has said in his comic. I would hope he goes in the category of a public figure, and thus I can call him an idiot-savant. Then again, he more or less calls himself that about every 3rd comic. He's obviously a genius in many ways, but he epically misses some stuff. One day, this tech blog might go way beyond tech, but I'm just not quite there yet, so I'm not going to start exhaustively fussing at Randall.

Mar 1, 2018 - LetsEncrypt / certbot renewal

This is the command for renewing an SSL cert "early":

sudo certbot renew --renew-by-default

Without the --renew-by-default flag, I can't seem to quickly figure out what it considers "due for renewal." Without the flag, you'll get this:

The following certs are not due for renewal yet:
  /etc/letsencrypt/live/[domain name]/fullchain.pem (skipped)
No renewals were attempted.

I should have the rate limits / usage quotas under "rate limits."

An update, moments after I posted this: the 3 week renewal emails are for the "staging" / practice / sandbox certs, not the live / real ones. I wonder when or if I'd get the live email? Also, I won't create staging certs again, so those won't help remind me of the live renewals again. I'll put it on my calendar--I'm not relying on an email--but still somewhat odd.

The email goes to your address in your /etc/letsencrypt/.../regr.json file, NOT the Apache config. I say ... because the path varies so much. grep -iR [addr] will find it.

Feb 2, 2018 - base62

Random base64 characters for passwords and such annoy me because + and / will often break a "word"--it's hard to copy and paste the string, depending on the context. Thus, I present base62: the base64 characters minus + and /. I considered commentary, but perhaps I'll leave that as the infamous "exercise to the reader." However, I do have a smidgen of commentary below.

Note, as of 2022/01/05, I am replacing the less-than sign of the php tag with an HTML less-than entity, because the real PHP tag disrupts the NetBeans editor. The current version of this code is now in GitHub.

Example

Assuming you call the file base62.php, give it exe permission, and execute from the Linux command prompt:

./base62.php 50
vjQBjFxJGcotOpxVJyvG1CUQ11010xigP1RyuKza120JWeFkeI

Validation

./base62.php 1000 | grep -P [ANZanz059]

That's my validation that I see the start, end, and midpoints of my 3 sets (arrays) of characters.

UQID

In the event that Google doesn't look inside the textarea, UQID: VMbAlZQ13ojI. That was generated with my brand new scriptlet. So far that string is not indexed by Google. UQID as in unique ID. Or temporarily globally unique ID. Or currently Google unique ID (GOUID?). Presumably it isn't big enough to be unique forever. 62^12 = 3 X 10^21. That's big but not astronomical. :)

somewhat-to-irrelevant commentary

What can I say? Sometimes I amuse myself. Ok. My structure is on the obtuse side. I couldn't help it. I usually don't write stuff like that. Perhaps Mr. 4.6 or one of my more recent contacts can write the clearer version. I actually did write clearer versions, but, then, I couldn't help myself.

further exercise to the reader

Perhaps someone will turn this into a web app? Complete with nice input tags and HTML5 increase and decrease integer arrows and an option to force SSL / TLS and AJAX.

installing

sudo cp base62.php /usr/bin
cd /usr/bin
ln -s ./base62.php base62
cd /tmp
base62
[output =] RyH3HjGnEalr71meSJfm

Now it's part of my system. I changed to /tmp to make sure that "." in the PATH wasn't an issue--that it was really installed.

Reference

Jan 28, 2018 - Stratego / Probe

I'd like to recommend Imersatz' Stratego board game implementation called Probe. It is the 3-time AI Stratego champion. The AI plays against you. It's a free download; see the "Download" link on that page. From the point of view of a human who is good at the game, I would call it quasi-intelligent, but it beats me maybe 1 in 7 times, so it's entertaining.

I am running the game through WINE, the Windows compatibility layer for Linux (despite the name, "Wine Is Not an Emulator"). I just downloaded it to make sure it matches what I downloaded to this new-to-me computer months ago. It does. Below I give various specs. Those are to make sure you have the same thing I do. It hasn't eaten my computer or done anything bad. I have no reason to think it's anything but what it says it is. In other words, I am recommending it as non-malware and fun. If it makes you feel any better, you can see this page in secure HTTP.

Probe2300.exe [the download file]
19007955 bytes
or 19,007,955 bytes / ca. 19MB
SHA512(Probe2300.exe)= e96f5ee67653eee1677eb392c49d2f295806860ff871f00fb3b0989894e30474119d462c25b3ac310458cec6f0c551304dd2aa2428d89f314b1b19a2a4fecf82
SHA256(Probe2300.exe)= ee632bcd2fcfc2c2d3a4f568d06499f5903d9cc03ef511f3755c6b5f8454c709

The above is the download file from Imersatz. In the probe exe directory, I get:

1860608 [bytes] Feb 28  2013 Probe.exe
 800611         Feb 28  2013 Probe.chm
1291264         Feb 28  2013 ProbeAI.dll

SHA256(ProbeAI.dll)= 13e862846c4f905d3d90bb07b17b63c915224f5a8c1284ce5534bffcf979537a
SHA256(Probe.chm)= 3b7be4e7933eee5d740e748a63ea0b0216e42c74a454337affc4128a4461ea6b
SHA256(Probe.exe)= 656f31d546406760cb466fcb3760957367e234e2e98e76c30482a2bbb72b0232

Jan 14, 2018 - grudgingly dealing with Mac (wifi installation)

The first time Mr. 4.6 installed Ubuntu Linux (17.10 - Artful Aardvark) on his Mac laptop (MacBook Pro?), wifi worked fine "out of the box." I think that's because he was installing Linux via wifi. This time, he used ethernet, and wifi wasn't recognized--no icon, no sign of a driver. Because he was using ethernet, maybe the installer didn't look for wifi? Maybe he didn't "install 3rd party tools"? (I asked him about that, but he was busy being excited that we fixed it. I'll try to remember to ask again.) There were good suggestions on how to fix it out there, but I derived the simplest one:

sudo apt-get install bcmwl-kernel-source

He didn't even have to reboot. His wifi icon just appeared.

For the record, that's "Broadcom 802.11 [wifi] Linux STA wireless driver source."

Thanks to Christopher Berner who got me very close. He was suggesting a series of Debian packages, but the above command installed everything in one swoop.

There are a few questions I have for 4.6 about this. Hopefully I'll get answers tomorrow or later.

Jan 3, 2018

JavaScript drag and drop

I created a JavaScript drag and drop example. I may have done it in JQuery a handful of times, but I don't remember for sure. This is a "raw" JS version--no JQuery or other libraries. I've been thinking about writing a to do list organizer which would use drag and drop. Also, I might use it professionally soon.

new-to-HTML5 semantic elements / tags

Last night, my apprentice Mr. 4.6 showed me these new HTML5 elements / tags. I remember years ago looking for a list of everything that is new in HTML5. I suspect I've at least heard of 75% of it from searching on various stuff, but I did not know about some of those tags. I would hope there is a good list by now. Maybe I'll look again or 4.6 will find one.

Dec 24, 2017 - remote MongoDB connections through Robo 3T / ssh port forwarding

A new trick to my Linux book:

ssh -L 27019:127.0.0.1:27017 ubuntu@kwynn.com -i ./*.pem

That forwards local port 27019 to kwynn.com's 27017 (MongoDB), but from kwynn.com's perspective 27017 is a local port (127.0.0.1 / localhost). Thus, I can connect through Robo 3T ("the hard way" / see below) to MongoDB on Kwynn.com without opening up 27017 to the world. In Robo 3T I just treat it like a local connection except 27019. (There is nothing special about 27019. Make it what you want. Thanks to Gökhan Şimşek who gave me this idea / solution / technique in this comment. )

I used this because I am suffering from a variant of the ssh tunneling bug in 3T 1.1. (I solved it. See below.) I think I have a different problem than most report, though. Most people seem to have a problem with encryption. I'm not having that problem because this is what tail -f /var/log/auth.log shows:


I suspect the Deprecated stuff is irrelevant:

Dec 24 00:11:11 kwynn.com sshd[18675]: rexec line 16: Deprecated option UsePrivilegeSeparation
Dec 24 00:11:11 kwynn.com sshd[18675]: rexec line 19: Deprecated option KeyRegenerationInterval
Dec 24 00:11:11 kwynn.com sshd[18675]: rexec line 20: Deprecated option ServerKeyBits
Dec 24 00:11:11 kwynn.com sshd[18675]: rexec line 31: Deprecated option RSAAuthentication
Dec 24 00:11:11 kwynn.com sshd[18675]: rexec line 38: Deprecated option RhostsRSAAuthentication
Dec 24 00:11:12 kwynn.com sshd[18675]: reprocess config line 31: Deprecated option RSAAuthentication
Dec 24 00:11:12 kwynn.com sshd[18675]: reprocess config line 38: Deprecated option RhostsRSAAuthentication
[end deprecated]

Dec 24 00:11:12 kwynn.com sshd[18675]: Accepted publickey for ubuntu from [my local IP address] port 50448 ssh2: RSA SHA256:[30-40 base64 characters]
Dec 24 00:11:12 kwynn.com sshd[18675]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
Dec 24 00:11:12 kwynn.com systemd-logind[960]: New session 284 of user ubuntu.
Dec 24 00:11:12 kwynn.com sshd[18729]: error: connect_to kwynn.com port 27017: failed.
Dec 24 00:11:12 kwynn.com sshd[18729]: Received disconnect from [my local IP address] port 50448:11: Client disconnecting normally
Dec 24 00:11:12 kwynn.com sshd[18729]: Disconnected from user ubuntu [my local IP address] port 50448
Dec 24 00:11:12 kwynn.com sshd[18675]: pam_unix(sshd:session): session closed for user ubuntu
Dec 24 00:11:12 kwynn.com systemd-logind[960]: Removed session 284.

For the record, the error I get is "Cannot establish SSH tunnel (kwynn.com:22). / Error: Resource temporarily unavailable. Failed to create SSH channel. (Error #11)."

This doesn't seem to be an encryption problem, though, because my request is clearly accepted. MongoDB is bound to 127.0.0.1--internal connections only--but this shouldn't be a problem because based on traceroute my system knows that IT is kwynn.com (It "knows" this in /etc/hosts). It doesn't try routing packets outside the machine.

On the other hand, this won't work in the sense that 3T won't connect:

ssh -L 27019:kwynn.com:27017 ubuntu@kwynn.com -i ./*.pem

Solution

Huh. I just fixed my problem. If I put kwynn.com in /etc/hosts as 127.0.1.1 then 3T won't work through "manual" ssh forwarding (like my command above), even if I forward as 127.0.1.1. If I put kwynn.com in /etc/hosts as 127.0.0.1, 3T works 3 ways: either through the above (127.0.0.1) OR this:

ssh -L 27019:kwynn.com:27017 ubuntu@kwynn.com -i ./*.pem

AND 3T works without my "manual" command-line ssh port forwarding, through its own ssh tunnel feature, which solves my original problem. However, I'm glad I learned about ssh port forwarding.

I need to figure out what the difference is between 127.0.1.1 and 127.0.0.1. AWS puts the original "name" of the computer in /etc/hosts as 127.0.1.1 by default, and I just read instructions to use 127.0.1.1. Oh well, for another time...

December 21, 2017 - kwynn.com has its first SSL cert, Mongo continued

I'm starting to write around 11:08pm. I'll probably post this to test the link just below, then I should write more.

SSL

Kwynn.com has its first SSL certificate. You can now read this entry or anything else on my site through TLS / SSL. I have not forced SSL, though: there's no automatic redirect or rewrite.

I remember years ago (2007 - 2009??), a group was trying to create a free-as-in-speech-and-beer certificate authority (CA). Now it's done, I've used it, and it's pretty dang cool. Here are some quick tips:

my ssl.conf

Rather than letting certbot mess with your .conf, it should look something like the following. Once the 3 /etc/letsencrypt files have populated with certbot ... certonly, then you're safe to restart Apache.

I included ErrorLog and CustomLog commands to make sure SSL traffic went to the same place as non-SSL traffic.

<VirtualHost *:443>

	ServerName kwynn.com
	ServerAdmin myemail@example.com

	DocumentRoot /blah
	<Directory /blah>
		Require ssl
	</Directory>

ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined

SSLEngine  on
Include /etc/letsencrypt/options-ssl-apache.conf
SSLCertificateFile /etc/letsencrypt/live/kwynn.com/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/kwynn.com/privkey.pem
</VirtualHost>

That does NOT force a user to use SSL. "Require" only applies to 443, not 80. If you want to selectively force SSL in PHP (before using cookies, for example), do something like this:

    if (empty($_SERVER['HTTPS']) || $_SERVER['HTTPS'] !== 'on') {
		header('Location: https://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI']);
		exit(0);
    }

As a critique of the above, empty() also covers the case where $_SERVER['HTTPS'] isn't set at all, which avoids an undefined-index warning in the Apache error log. I'll try to remember to test this and adjust it later.

MongoDB continued -- partial SSL

I started to secure MongoDB with SSL / TLS, but then I noticed the Robo 3T option to use an SSH tunnel. Since one accesses AWS EC2 through an ssh tunnel anyhow, and I want access only for me, there is no need to open MongoDB to the internet. I'd already learned a few things, though, so I'll share them. Note that this is not fully secured because I had not used Let's Encrypt or any other CA yet, and I'm skipping other checks as you'll see. I was just trying to get the minimum to work before I realized I didn't need to continue down this path. See Configure mongod and mongos for TLS/SSL.

cd /etc/ssl/
openssl req -newkey rsa:8096 -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem


Then set up the config file as such:

cat /etc/mongodb.conf

storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true

systemLog:
  logAppend: true

net:
  bindIp: 127.0.0.1
  port:   27017
  ssl:
    mode: requireSSL
    PEMKeyFile: /etc/ssl/mongodb.pem

Then the NOT-fully-secure PHP part:

<?php
set_include_path('/opt/composer');
require_once('vendor/autoload.php');

$ctx = stream_context_create(array(
	"ssl" => array(
	    "allow_self_signed" => true,
	    "verify_peer"       => false,
	    "verify_peer_name"  => false,
	    "verify_expiry"     => false
	)
    )
);

$client = new MongoDB\Client("mongodb://localhost:27017", 
				array("ssl" => true), 
				array("context" => $ctx)
		);

$dat = new stdClass();
$dat->version = '2017/12/21 11:01pm EST (GMT -5) America/New_York or Atlanta';
$tab = $client->mytest->stuff;
$tab->insertOne($dat);

Dec 18, 2017 - MongoDB (with PHP, etc.)

I started using relational (SQL) databases in 1997. Finally in the last few years, though, I've seen a glimmer of the appeal of OO / schema-less / noSQL / whatever databases such as MongoDB. For the last few months I've been experimenting with Mongo for my personal projects. I'm mostly liking what I'm seeing. I haven't quite "bitten" or become sold, but that's probably coming. I see the appeal of simply inserting an object. On the other hand, I've done at least one query so far that would have been far easier in SQL. (Yes, I know there are SQL-to-Mongo converters, but the one I tried wasn't up to snuff. Perhaps I'll keep looking.)
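
To make that concrete, here's a sketch using the composer PHP-Mongo library I mention just below (the collection and field names are made up): the insert is one line, while a query that would be a trivial GROUP BY in SQL becomes an aggregation pipeline.

<?php
set_include_path('/opt/composer');
require_once('vendor/autoload.php');

$stuff = (new MongoDB\Client("mongodb://localhost:27017"))->mytest->stuff;

// the appealing part: just throw an array / object at it
$stuff->insertOne(['page' => '/blog', 'hits' => 1]);

// the less appealing part: roughly SELECT page, COUNT(*) FROM stuff GROUP BY page
$rows = $stuff->aggregate([
    ['$group' => ['_id' => '$page', 'count' => ['$sum' => 1]]],
    ['$sort'  => ['count' => -1]],
]);

foreach ($rows as $row) {
    echo $row->_id . ': ' . $row->count . "\n";
}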

I've been using Robo 3T (v1.1.1, formerly RoboMongo) as the equivalent of MySQL Workbench. I've liked it a lot. In vaguely related news, I found it interesting that some of the better Mongo-PHP examples I found were on Mongo's site and not PHP's. The PHP site seems rather confused about versions. I'm using the composer PHP-Mongo library. Specifically, the results of "$ composer show -a mongodb/mongodb" are somewhat perplexing, but they include "versions : dev-master, 1.3.x-dev, v1.2.x-dev, 1.2.0 ..." At the MongoDB command line, db.version() == 3.4.7. I don't think Mongo 3.6 comes with Ubuntu 17.10, so I'm not jumping up and down to install "the hard way," although I've installed MDB "the hard way" before.

Mostly I'm writing this because I've been keeping that PHP link in my bookmarks bar for weeks. If I publish it here, I don't need the link taking up valuable real estate there. Then again, in a related case I forgot for about 10 minutes that I had put my Drupal database timeout fix on my own web site. Hopefully I'll remember this one next time.

Dec 17, 2017

Today's entry 2 - yet another Google Apps Script / Google Calendar API error and possible Google bug

I solved this before I started the blog and wrote about the other errors below. The error was "TypeError: Cannot find function createAllDayEvent in object Calendar." This was happening when I called "CalendarApp.getCalendarById(SCRIPT_OWNER);" twice within a few lines (milliseconds or less) of each other. The failure rate was something like 10 - 15% until I created the global. The solution is something like this:

var calendarObject_GLOBAL = false;

function createCalendarEntry(summary, dateObject) {
	return calendarObject_GLOBAL.createAllDayEvent(summary, dateObject);
}

calendarObject_GLOBAL = CalendarApp.getCalendarById(SCRIPT_OWNER); // fetch the Calendar object exactly once and cache it

createCalendarEntry('meet Bob at Planet Smoothie', dateObject123);

I'm not promising that runs; it's to give you the idea. Heaven forbid I post proprietary code, and there is also the issue of taking the time to simplify the code enough to show my point. I should have apprentices for that (hint, hint).

I was getting errors when I called CalendarApp... both inside and outside the function. I suspect there is a race condition bug in Google's code. We've learned the hard way how fanatical they are about asynchronicity. Sometimes that's a problem.

Yes, yes. I'm being sarcastic, and I may be wrong in my speculation. I understand the benefit of all async. But isn't part of the purpose of a blog to complain?

Today's entry 1

I just updated my Drupal database connection error article.

Dec 6, 2017 - today's entry 2 - fun with cups and Drupal runaway error logs

I just discovered that /var/log/cups was using 40GB. Weeks ago I noticed cups was taking 100% of my CPU (or one core, at least) and writing a LOT of I/O. It was difficult to remove it entirely. The solution was something to the effect of removing not only the "cups" package but also the cups-daemon package. cups is the Linux printing service. I haven't owned a working printer in about 6 years, and I finally threw the non-working one away within the last year.
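
In case it helps, finding that sort of disk hog and getting rid of cups looked roughly like this (the purge line is approximate; I don't remember the exact package set):

sudo du -h --max-depth=1 /var/log | sort -h
sudo apt purge cups cups-daemon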

I've had the same runaway log problem with Drupal writing 1000s of warnings (let alone errors) to "watchdog." It took me a long time to figure out that's why some of my Drupal processes were so slow. It seems that Drupal should simply stop logging after a certain number of iterations rather than thrash the disk for minutes. If I cared about Drupal, perhaps I would lobby for this, but I have come somewhere close to despising Drupal; that's another story for another time.

Dec 6, 2017 - fun with systemd private tmp directories

This happens when you just want to use /tmp from Apache, but no, you get something like /tmp/systemd-private-99a5...-systemd-resolved.service-Qz... owned by root and with no non-root permission. (Yes, yes, I have root access. That's not the point.) Worse yet, there are a bunch of such systemd directories, so which one are you looking for? Yes, yes, I'm sure there is a way to know that. Also not the point. The point is: please just make it stop!

Solution (for Ubuntu 17.10 Artful Aardvark)

  1. with root permission, open for editing: /etc/systemd/system/multi-user.target.wants/apache2.service
  2. change the PrivateTmp line from true to false, so it reads: PrivateTmp=false
  3. run this: sudo systemctl restart apache2.service (a consolidated command sketch follows this list)
  4. I don't think you need to restart Apache itself (see note below), but I'm not sure. I did restart Apache, but I didn't try it without restarting Apache.
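
The same thing as a one-shot sketch, assuming the unit file really does contain "PrivateTmp=true" on its own line. The daemon-reload is worth having in there--systemd generally wants it after a unit file changes on disk:

sudo sed -i 's/^PrivateTmp=true$/PrivateTmp=false/' /etc/systemd/system/multi-user.target.wants/apache2.service
sudo systemctl daemon-reload
sudo systemctl restart apache2.service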

Notes

I don't even know if restarting the apache2.service is the same thing as restarting Apache or not. On this point, it is worth noting that sometimes you have to stop going down the rabbit hole, or you may never accomplish what you set out to do. Yes, I should figure out what this systemd stuff is. Yes, I should know if the apache2.service is separate from Apache. One day. Not when I'm trying to get something very simple accomplished, though. Also, yes, I understand the purpose of a root-only private directory under /tmp. Yes, I understand that /tmp is open to all. But none of that is the point of this entry.

If you can't tell, I'm a bit irritated. Sometimes dev is irritating.

For purpose of giving evidence to my night owl cred, I'm about to post at 2:24am "my time" / EST / US Eastern Standard Time / New York time / GMT -5 / UTC -5.

2017, Nov 14 (entry 5)

I did launch with entry 4.

I just took an AWS EC2 / EBS snapshot of an 8GB SSD ("gp2") volume from my Kwynn.com "nano" instance at US-east-1a. With my site running, it took around 8 minutes. The "Progress" showed 0% for 6 - 7 minutes, then briefly showed 74%, then showed "available (100%)." It ran from 2:55:34AM - around 3:03am. My JS ping showed no disruption during this time. CPU showed 0%. I didn't try iotop. (Processing almost certainly takes place almost if not entirely outside of my VM, so 0% CPU makes sense.)

This time seems to vary over the years and perhaps over the course of a day, so I thought I'd provide a data point.
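
If you'd rather take the snapshot from the CLI than the console, this is roughly the equivalent (the volume id is made up):

aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "kwynn.com root volume"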

Entry 4 and launch attempt 2

I wrote entries 1 - 3 at the end of October, 2017, but I have not posted this yet. I'm writing this on Friday, November 10 at 7:34pm EST (Atlanta / New York / GMT -5). I mention the time to emphasize my odd hours. See my night owl developer ad.

I'm writing right now because of my night owl company (or less formal association) concept. My potential apprentice whom I codenamed "Mr. 4.6 Hours" has been active the last few days. I'd like to think I'm getting better at the balance between lecturing, showing examples, and leaving him alone and letting him have at it. I think he's making progress, but he's definitely making *me* think and keeping me active. Details are a longer story for another time. Maybe I'll post some of my sample code and, eventually, his code.

He's not around tonight, and I miss the activity. As I said in the ad, I'd like to get to the point that I always have a "green dot" on Google Chat / Hangouts or whatever system we wind up agreeing on.

Based on the last few days, I have a better idea of how to word my ad and the exchange I want with apprentices. Perhaps I'll write that out soon.

dev rules 1 and 2

Rules 1 and 2 are in entries 1 and 3, respectively, below.

Rules 3 and 4 are way "above" / later.

Entry 3: dev rule #2

My first GAS problem, and perhaps the 2nd if it is indeed a server problem, bring up my rule #2:

Kwynn's software dev rule #2: always host applications on a site where you have root access and otherwise a virtual machine--something you have near-total control over. It should be hard to distinguish your control of the computer sitting next to you versus your host.

Amazon Web Services (AWS) meets my definition. AWS is perhaps one of the greatest "products" I've ever come across. It does its job splendidly. When they put the word "elastic" (meaning "flexible") in many of their products, they mean it.

Others come close. I used Linode a little bit; it's decent. I have reason to believe Rackspace comes close. I am pretty sure that neither of them, though, lets you lease (32-bit / IPv4) IP addresses the way AWS does. I am reasonably sure getting a 2nd IP address with Linode or Rackspace is a chore--meaning ~$30, human intervention, and / or a delay. With Amazon, a 2nd IP address takes moments and is free as long as you attach it to an (EC2) instance.
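
For what it's worth, "takes moments" is literal; with the AWS CLI it's roughly two commands (the instance and allocation ids are made up):

aws ec2 allocate-address --domain vpc
aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-0123456789abcdef0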

This rule is less absolute than #1. Violating it always leads to frustration, though, and wasted time. Whether the wasted time is made up for by the alleged benefits of non-root hosts is a question, but I tend to think not. I've been frustrated to the point of ill health--one of the very few times I've *ever* been sick. That's a story for another time, though.

If it's not clear, using GAS violates the rule because of the situation where there is nothing you can do. I had some who-knows-the-cause problems with AWS in late 2010, but I've never had a problem since. If, heaven forbid, I did have a problem, I could rebuild my site in another Amazon "availability zone" pretty quickly. As opposed to just being out of luck with GAS.

Why I violate the rule with GAS is another story, perhaps for another time. I'll just say that if it were just me, I'd probably avoid GAS. With that said, some time I should more specifically praise some features of GAS as it applies to creating a Google Doc. I was impressed because given the business logic limitations I was working with, GAS was likely easier than other methods.

Entry 2: Google Apps Script and StackOverflow.com

I've been considering a blog for months if not years. I finally started because of this problem I'm about to write about.

This blog entry deals with both the specific problem and a more general problem.

The specific problem was, in Google Apps Script (GAS), "Server error occurred. Please try saving the project again". The exact context doesn't really matter because if you come across the problem, you know the context.

I spent about an hour chasing my tail around trying variations and otherwise debugging. At some point I tried to find info on Google itself. Google referred "us" to StackOverflow.com (SO) with the [google-apps-script] tag. Google declares that to be the official trouble forum. As it turned out, someone else was having the same problem. I joined SO in order to respond. Then roughly 4 others joined in. We were all having the same problem, and nothing we tried fixed it. I am 99% sure it was a Google server problem and there was nothing we could do. The problem continued during that night. Then I was inactive for ~14 hours. By then, everything worked.

The more general problem I wanted to address is the way SO's algorithms handled this. The original post and my response are still there several weeks later. However, others' perfectly valid responses were removed. To this day, SO still says, "Because [this question] has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site..."

This sort of algorithmic failure troubles me. I'd like the memory of those deleted posts on the record.

I was motivated to write about this because I encountered another GAS error a few hours ago that I once again suspect is a server error. This time, I was the one who started the thread. 2 hours later, no one has answered. I'm curious how this turns out. I'm not linking to the thread because it's still possible I caused the problem. Also, I'm not linking to it because Google almost immediately indexed it, so SO is the appropriate place to go.

Entry 1: dev rule #1

Kwynn's Software Dev Rule #1: Never develop without a debugger. You will come to regret it. To clarify terms, by "debugger," I mean a GUI-based tool to set code breakpoints, watch variables, etc. Google Chrome Developer Tools "Sources" tab is a debugger for client-side JavaScript. Netbeans with Xdebug is a debugger for PHP. Netbeans will also work with Node.js and Python.
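
For PHP specifically, the Xdebug half of that setup is mostly a couple of ini lines. This is a sketch of the Xdebug 2.x settings that were current in 2017; the exact file path varies by distro (the 7.1 path is a guess based on Ubuntu 17.10):

; e.g. /etc/php/7.1/mods-available/xdebug.ini
zend_extension=xdebug.so
xdebug.remote_enable=1
xdebug.remote_host=127.0.0.1
xdebug.remote_port=9000

Then the IDE (Netbeans, in my case) just listens on port 9000.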

It is tempting to violate this rule because you think "Oh, I'll figure it out in another few minutes."

Another statement of this rule is "If you're 'debugging' with console.log or print or echo, you're in big trouble."

page history

HTML5 valid