[JMR202108181958 -- adding summary:
My own sisters don't want to read this because I'm speaking too much Geek. Re-reading it, I guess they're right. And what I wrote seems to wander all over without apparent reason -- not apparent unless you already know what I'm trying to say.
So, I guess I should put a high-level summary up front here:
Up until twenty years ago, common wisdom was that IBM picked the "right" CPU for the IBM PC. Only crackpots like me thought differently. But the evidence mounts, and now the common wisdom is that IBM picked the "wrong" CPU for the "right reasons". 
And then you still hear lots of things that don't match reality. Or, at least, I still hear a lot of things that don't match what I know and remember on the subject. That's part of the reason I wrote this rant, to tell what I remember of things.
But you never hear what I think is the real reason, and that's the real reason I wrote this.
The short version of what I understood at the time was the reason -- and I have seen no real evidence to the contrary -- is this:
(1) IBM didn't choose Motorola's 6809 because Motorola did not support it like they should have. Motorola didn't support the 6809 like they should have because they were afraid of it eating into the 68000's market.
(2) IBM didn't choose Motorola's 68000 because they (IBM) were afraid that a personal computer based on the 68000 would eat into their market for their mid-range (minicomputer-class) computers based on the System/3.
That's the conclusion, and the rest of this rant is about where that conclusion comes from.
] 
You need to understand the situation in 1979-81 clearly.
You have to remember that IBM was not officially considering entering the personal/home computer manufacturing business. 
In another company, the project to develop the PC might have been called a skunkworks project. But the PC project had even less official recognition. Yes, they were working separately from the main company, yes, management kept hands-off. (Some had even washed their hands of it.) Work was performed in secrecy, and it was not only started without contract or official directive, it was mostly complete by the time upper-level management acknowledged it.
IBM already had their blue-sky research projects, which were something more akin to the Skunk Works at Lockheed. This was different.
Not to say that Skunk Works and blue-sky were completely free from adversarial management, but the PC project was pursued in a much more adversarial management environment. The engineers who built the initial prototype were permitted to do so by their manager, who acted in specific contradiction to direction from the next-up level of management. 
At least, that's the story I heard several times while working internships for IBM, and those stories matched what I was seeing, where later stories do not.
Of course, those who let the project move ahead were, to more-or-less extent, putting their own jobs on the line for the results. 
IBM's marketing and engineering did not want to deal with the threat of microprocessors in general-purpose computing devices. The attitude I heard was, of course microprocessors can't do the job. They are strictly for control devices and calculators.
And it wasn't exactly a mistaken attitude. All existing microprocessors at the time were missing elements that were important to general-purpose computing -- memory management hardware, direct memory access input-output channels, a hardware timer for dividing the CPU's time between tasks, proper hardware division of task resources, .... And the list goes on. 
Microprocessors are still missing much of that list. But even the "big" computers of the time didn't have all these things in place, either. So it was, in fact, ignoring reality. 
I will mention this attitude again further down, but this is enough to get a feel of how things were at IBM.
Apple and Commodore's history is pretty well known. That is to say, I was not close enough to them to add much, so I won't. Everyone knows that Apple IIs were selling well in business and education markets, and Commodore's offerings were just behind in business and education, but were ahead in personal/home use sales, and eating away at the dedicated games machine market.
But Radio Shack's history is not so well known. This should not be surprising. Radio Shack really didn't have an approach to write about. The TRS-80 sort of fell into their lap. 
(Again, I was listening to local management discuss things while it was happening. I was trying to be a Radio Shack salesman in Odessa, Texas when the first TRS-80s were delivered. I got to unpack our demo unit and write something up as a demonstration program.) 
The guy who designed the Z-80-based TRS-80 original model (now called Model 1) just kludged some stuff together from demonstration circuits published by Zilog and hobby circuits from the hobby industry. I saw the circuit diagrams, and I knew enough to tell where some of the short-cuts had been taken that were a bit beyond specifications for the parts. It was intended as a proof-of-concept, but Radio Shack had no engineers at the time to actually fix the design, and had no motivation to do so, either. It was part of their culture, and they were looking to compete with folks like Tramiel.
Now, I took a break from the technical world during 1979 and 1980, to serve God according to my understandings and belief. When I returned, Radio Shack had hired a few engineers and was half-heartedly trying to clean up the Z-80-based model designs. 
And they had had another personal/home computer design fall into their lap, the M6809-based Color Computer. This one was a Motorola example circuit, with just a little modification. 
The 6809 was potentially more powerful than the Z-80, powerful enough to go head-to-head with the 8088 in certain applications, powerful enough to build a minicomputer-class microcomputer with. But Radio Shack didn't have the engineers or the motivation to do anything but sell the thing as a game machine -- or as a toy for geeks. 
Microware had been able to get their OS-9/6809 operating system running on the Color Computer, and it was suddenly a serious business and/or industrial controller class machine -- a seriously (woefully) hobbled business machine, but squarely in the same class of market that the IBM PC would shortly be released for. So Radio Shack let some engineers cobble together a floppy controller that could support OS-9/6809 and could be plugged into the game cartridge port -- instead of investing in designing a true business-class machine as an upgrade to the Color Computer.
Admittedly, Motorola was not really supporting the 6809. They were busy selling into the microcontroller market as many true 8-bit 6805s and mixed 8/16-bit 6801s as they could manufacture, and they considered their future to be riding on such smaller microcontrollers and on the 68000.
We should remember that the nascent PC market was not nearly as important to Motorola (or to Zilog) as it was to Intel or Commodore. Motorola sold several orders of magnitude more microcontrollers than any CPU manufacturer has ever sold microprocessors for personal computers. And it was hard to argue with their attitude. It would be fifteen years down the road before mind-share issues would be eroding their position in the controls market enough to get the full attention of Motorola's management.
Focusing on controls was not a mistake for Motorola. Not seeing the mind-share problem was. And only a very few engineers anywhere were seeing the potential uses for PCs as communications devices at the time, so it's not too surprising that Motorola didn't recognize why the PC was important to their future.
That was how things stood when I returned in late 1980, intending to work my way through school as an electronics tech.
So, Radio Shack needed to have better engineering to do anything with the better CPU technology that had fallen into their lap -- twice. 
Commodore's Tramiel wasn't the only one in the industry who thought engineering should be sacrificed for price and immediate profits, and was not the only one who found himself left to drink from the marketing stream he himself had polluted. 
So what about engineering? Was it really so obviously necessary back then as it seems in hind-sight?
Motorola was fighting with engineering missteps in the 68000. Intel was struggling with over-engineering the iAPX 432 (and under-engineering the 8086, but ...). Zilog was struggling with similar problems with their Z8000. 
The first 32-bit addressing CPUs (microprocessors or not) all bore the marks of engineering specs that were way too ambitious in application areas that were poorly understood -- too much engineering without foundation in real-world experience. (We're still doing it.) 
Consider this -- the 32-bit CPUs existing before the 68000 were not microprocessors, of course. They were all leading-edge hardware in mainframes, and the companies that produced them were protecting their inventions and technology with strict secrecy. 
Large memory systems, sharing a processor between multiple tasks, and coordinating the work of multiple processors were all brand new application fields for most of the industry. 
Real technology was just not available. That is, what was available other than theoretical technology was really hard to come by. And when you engineer things that complex with little real-world experience to guide you, you're going to make mistakes.
Insane competition drove the industry to push ahead into the 32-bit world a lot harder and faster than was safe or wise. In some senses, IBM's and Motorola's hesitation made sense.
Usually, you hear that IBM's options to the 8086 were Motorola's 68000 and Zilog's Z8000. Of the three, the obvious choice was the 68000, and the usual question is why it was not chosen.
Remember, nobody knew what they were doing. 
The 68K was the best of the actually available options, but Motorola's design for handling the big memory space was too ambitious, relying on false understandings of the underlying problems. That was where the "bugs" were, although nobody really had a better handle on it at the time.
Three specific design misfeatures in the 68000:
(1) Complexity, and the expectations that induced, although that didn't get really out of hand until the 68020. 
Nowadays, the 68000 seems relatively tame, but it was at least an order of magnitude more complex than, say, the 6809, and we lacked testing and design tools for the complexity then. 
You could use 8-bit engineering on the 68000, but you had to ignore all the cool features of the CPU to do so. That was hard for engineers designing for the 68000 to stomach, not to mention for management and marketing to justify. 
I think it was less the cost of supporting it all than having to plan on starting with a machine that both you and the customer would know was going to be replaced by a more complete re-design within the year.
So, why not build the more complex design to start with? That was what a lot of tech companies tried to do, and what a lot of them failed at.
[EDIT 20240422: 
IBM themselves had a department trying to build a real computer based on the 68000 at the time, the IBM Scientific 9000 or 9000 Scientific, depending on what docs you were reading.
]
It took too long to test, especially to the expectation levels we had then. We were too afraid of bugs in non-critical software and hardware. We were missing tools, and management was too scared of sinking money into the projects to fund developing them.
We are now used to the market itself being a required testing stage, but we weren't used to that idea back then. (Think about how your "smart" phone does so many things you don't want it to now. That's misfeatures, you know -- bugs -- that you are testing.)
For the record, note that the 8086 was about half an order of magnitude more complex than the 6809, although it was less complex than the 68000.
(2) The original 68000 was not directly fully 32-bit addressing when using position-independent code. (Absolute addressing code, yes. Position independent, almost, but requiring software shims for modules larger than 64K.) There were missing 32-bit constant offsets in certain addressing modes. 
Position independence was a great plan, and Motorola should be commended for embracing it. But by only giving offsets of +/- 32K in those indexing modes, Motorola had built the 68K with a hidden obstacle to getting past the 64K module size barrier when designing for position-independent code. You had to use two instructions and a register (a 32-bit load immediate in addition to the register-offset indexing mode of the instruction you wanted) to get full 32-bit range with position independence.
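To put numbers on that +/- 32K limit, here is a minimal sketch in Python. It assumes a simplified model in which the only constant offset available is a signed 16-bit field; the function names are mine, made up for illustration, and it does not model the 68000's actual instruction encodings.

```python
# Simplified model (an assumption, not a cycle-accurate description):
# the constant offset in the indexing mode is a signed 16-bit field.
DISP16_MIN = -(1 << 15)      # -32768
DISP16_MAX = (1 << 15) - 1   # +32767


def fits_disp16(offset: int) -> bool:
    """True if the offset can be encoded directly as a 16-bit constant."""
    return DISP16_MIN <= offset <= DISP16_MAX


def instructions_needed(offset: int) -> int:
    """Hypothetical cost model matching the text: one instruction when the
    constant offset fits, otherwise two (load the 32-bit offset into a
    register, then use the register-offset form of the indexing mode)."""
    return 1 if fits_disp16(offset) else 2


if __name__ == "__main__":
    for off in (100, 32767, 32768, -32768, 100_000):
        print(off, fits_disp16(off), instructions_needed(off))
```

Anything inside a 64K module is reachable with a single instruction; the moment a reference crosses that range, the extra load and the extra register show up.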
This is most of the reason CodeWarrior for Mac had a small memory model similar to the x86's small memory model. If the 68000 had had full-range constant offsets, the memory models for the 68000 could have been blended, programmers wouldn't have had to plan for it, and the small model would have disappeared as the compiler matured.
Imagine telling your manager about that, after you successfully lobbied for the 68000 instead of the cheaper, but less capable 8086, because the 8086 had the barrier and the 68000 didn't. (Then think about trying to explain why position independence, which wasn't really achievable on the 8086, was important.)
Well, two instructions and one data register on a CPU with 8 data registers are not nearly the impediment that four-to-eight instructions and the only accumulator are on a CPU with only one accumulator. The small and big models on the 68000 could have been blended anyway, but we (the market) had this antipathy to compilers that took more than two optimizing passes and then added another optimizing pass in the link phase.
(It would be a few years before we as an industry generally recognized that optimizing too early was a big problem, but that's a rant for another day.)
(3) Motorola made some problematic design decisions in how the 68000 handled exceptions in the intransigent case of memory page faults and such. As a result, there was not enough information about the cause of the exception to recover and continue, which made it difficult to share memory between running applications safely.
Intel just didn't handle it at all in the 8086, which left them able to quickly recover from the mistakes of trying to implement too much complexity in the 80286 when they moved to the 80386. Most companies who needed to deal with memory exceptions for the 80286 said they would try to implement the exception stuff in their next software upgrades, but by the time they were getting started on the next upgrades, Intel had the design for the 386 ready. The 80386's approach was much simpler, and nobody needed to go down the 80286 rabbit hole any further.
Motorola tried to handle those exceptions in the original 68000, but got it slightly wrong, and that, more than the extra clock cycles, was what kept the 68451 from being the MMU in any of the major workstations built on the 68K. Engineers understood the problems of wait states, and expected that the newer versions would be able to run with fewer wait states. They could expect wait states to be handled in hardware. But the exception handling misfeature meant having to plan on re-writing the very code that they thought they only wanted to write once.
Motorola did fix that in the 68010, which was released to the market in 1982. Unfortunately, they did not fix the 32-bit constant offset problems until the 68020.
Now, it should be noted that, ultimately, problems in the code for handling memory management are still biting us in the form of microcode vulnerabilities, and somebody has to regularly rewrite it. (Remember the Meltdown and Spectre vulnerabilities, for example?) ARM64 suffers less than AMD/Intel64, but the CPU vendors are still struggling with it. It's a very difficult problem to solve well.
Which means that the mistakes in the 68000's exception handling were not really a 68000 problem, they were a general problem for the whole industry. But everyone still likes to call it a 68000 problem, because no one really wants to admit we as a race don't already know everything we need to know for handling today's problems yesterday.
Now, Intel did have an advantage of sort-of learning from their misadventures with the iAPX 432. That is, they just decided to punt on the problem with the 8086, and gave it non-enforcing segment registers -- which aren't really segment registers. It was a (not very good) non-solution with interesting and sort-of useful side-effects that caused more problems down the road.
The 8086 segment registers, instead of being the width of the 20-bit addressing of the 8086, were just sixteen bits slid over four, which made the segment registers clumsy to work with as base registers for large arrays and such. You ended up needing four to eight instructions and two or three registers to properly handle 20-bit pointers. 
And there were no segment limit registers, which is why I say they were not really segment registers. Implementing segment limits in software on the 8086 essentially doubled the already excessive instruction overburden.
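For those who want to see the "sixteen bits slid over four" arithmetic, here is a minimal sketch in Python. The function names and the software limit check are mine, for illustration only; the real thing was done in assembler, which is where the four-to-eight instructions went.

```python
ADDRESS_MASK = 0xFFFFF  # the 8086 has 20 address lines


def physical(segment: int, offset: int) -> int:
    """20-bit physical address: the 16-bit segment shifted left four bits,
    plus the 16-bit offset, wrapping at 1 MB as on the 8086."""
    return ((segment << 4) + offset) & ADDRESS_MASK


def normalize(segment: int, offset: int) -> tuple:
    """Rewrite a segment:offset pair so the offset is 0..15. Many different
    pairs name the same byte, so pointers have to be normalized like this
    before they can be compared or stepped through a large array."""
    linear = physical(segment, offset)
    return linear >> 4, linear & 0xF


def checked_physical(segment: int, offset: int, limit: int) -> int:
    """Hypothetical software bounds check. The hardware has no segment
    limit register, so any limit has to be tested in code like this."""
    if not 0 <= offset <= limit:
        raise ValueError("offset outside the segment limit")
    return physical(segment, offset)


if __name__ == "__main__":
    print(hex(physical(0x1234, 0x0010)))   # same byte as 0x1235:0x0000
    print(normalize(0x1234, 0x0010))
```

Note that two different segment:offset pairs landing on the same physical byte is not a bug in the sketch. That is how the 8086 works, and it is part of why I say these were not really segment registers.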
In a sense, the pseudo-segmentation was used like a cheap alternative to bank-switching, avoiding the use of external hardware to achieve expanded memory range. Ironically, though, until the 80386 was available, external hardware bank switching was still used in addition to those pseudo-segment registers on the 80x86-based PCs with large memory requirements.
The segment registers did not really solve the underlying problems, nor did they contribute to a proper external hardware solution, they just allowed (with the cooperation of the market) the problems to be swept under the rug until the 80386 was available.
Yes, as I mentioned above, the 68000 had sort-of similar problems, but not nearly to the same degree. If you needed segmentation on the 68000, it was just a matter of index modes, and even with the shortcoming with constant offsets, it only took two instructions and a register to take care of, not four-to-eight instructions. (You just had to ignore the obvious CHK instruction if your segments were going to be larger than 64K -- until the 68020, when that was fixed. But, back then, nobody wanted to waste time bounds-checking things anyway.)
(4) (Note, this is more than three.) The 8088's 8-bit external bus was cheaper! -- but no, not really.
You may have heard such nonsense about the 8-bit external 8088 being cheaper to design for than the 16-bit external 68000. Let's calculate this: 
-- Yes, the entry level model would have had sixteen 16Kx1 DRAMs instead of just eight.
No, that would not have broken the bank. The initial sticker price could have added the extra USD 32.00 or so without breaking any significant price barrier. (Go look up the original prices.) That differential would drop as production ramped up.
The RAM configuration would have been 16 chips wide instead of 8, but that was only a small routing problem -- eight extra wires. The max on-board RAM for the mainboard in the original version could have remained at 64K, although in configuration of 32Kx16 bits (32Kx2 bytes).
-- Yes, the expansion bus did present a problem. Catering to the 8088's limitations allowed sweeping the expansion bus width under the rug for a couple of years, limiting performance options that could otherwise have been had if the 8086 had been a planned option from the beginning. 
Wider connectors were more expensive, but only until ordered in large numbers. Total added to the initial price? Maybe USD 5.00. And that would disappear quickly as production ramped up.
Really, the narrower 8-bit expansion bus was solving a different problem than cost. 
-- The 8 kilobytes of BIOS ROM was done as four 2Kx8 ROMs, and the only difference necessary would have been the 16-bit wide data bus configuration. 16-bit data meant pairing the ROMs, but 8 kilobytes is 8 kilobytes whether done as four unpaired ROMs or two pairs of the same size ROMs:
4 x (2Kx8) == 8Kx8 bits => eight kilobytes
vs.
2 x 2 x (2Kx8) == 4Kx16 bits => eight kilobytes
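If you want to check that arithmetic, or the RAM arithmetic further up, here is a minimal sketch in Python. The function name and its parameters are made up for illustration, and it ignores parity chips and address decoding entirely.

```python
K = 1024


def organize(chips: int, words_per_chip: int, bits_per_chip: int, bus_bits: int):
    """Arrange identical memory chips across a data bus.

    Returns (depth in bus-wide words, bus width in bits, total bytes).
    Assumes the chips tile the bus evenly; parity is ignored.
    """
    chips_per_row, leftover = divmod(bus_bits, bits_per_chip)
    if leftover or chips % chips_per_row:
        raise ValueError("chips do not tile the bus evenly")
    rows = chips // chips_per_row
    depth_words = rows * words_per_chip
    total_bytes = depth_words * bus_bits // 8
    return depth_words, bus_bits, total_bytes


if __name__ == "__main__":
    # Four 2Kx8 ROMs: unpaired on an 8-bit bus, paired on a 16-bit bus.
    print(organize(4, 2 * K, 8, 8))     # 8K deep, 8 bits wide
    print(organize(4, 2 * K, 8, 16))    # 4K deep, 16 bits wide
    # 16Kx1 DRAMs: eight on an 8-bit bus, sixteen on a 16-bit bus.
    print(organize(8, 16 * K, 1, 8))
    print(organize(16, 16 * K, 1, 16))
```

Both ROM layouts come out to the same 8192 bytes. The sixteen-chip RAM row comes out to 32 kilobytes, so the wider bus costs you more chips at the entry level but you get the memory for it.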
Yes, the engineers would have had to take a little time learning the 68000's superior instruction set to be sure they weren't wasting space and cycles in the 4 2Kx8 BIOS ROMs, whereas they were already familiar with the 8086's instruction set. 
Marketing types with no patience to understand what they were talking about tended to say things like, 
"But the instructions are 16 bits wide! That's going to be twice the memory to store programs!!!!!!"
That is potentially a legitimate concern with 32-bit RISC instruction sets, which are usually not as densely encoded as either the 8086 or 68000. 
But instructions in the 8086 are variable width, in 8-bit chunks. Instructions in the 68000 are also variable width, in 16-bit chunks that do more work than the 8-bit chunks of the 8086. Motorola was careful there.
(Just for the record, neither the 8086 nor the 68000 is as densely encoded as the 6809, but if the 6809 instruction set is expanded to handle addresses larger than 16-bits, you lose some of that density.)
-- What else? Peripheral chips? 
The 68000 had instructions specifically to enable using 8-bit wide peripheral parts without having to adapt the peripheral's 8-bit wide interface to the 68000's 16-bit wide data bus. It might have felt a little "awkward" or "non-ideal" to some engineers, but most experienced engineers would not have even blinked an eye.
As a specific case-in-point, the original IBM PC used Motorola's 6845 to generate video. Motorola had a reference design for using that exact chip with the 68000 (and, indeed, hardware for the 68000 based off that reference design was not unknown in the industry).
(5) Lack of software isn't exactly a misfeature, but it is often invoked as a reason the 68000 wasn't ready.
Remember that CP/M-86 had not entered into development when the unofficial IBM PC project began. Whatever CPU they chose, they were either going to be dependent on the CPU's manufacturer for an existing OS, developing their own, or getting a third party to develop one for them. Choosing the 8086, they initially turned to Digital Research.
Note that it was already known that 8080 or Z-80 software needed more than just re-assembling the source code with an 8086 assembler. Transliteration was possible to an extent, but cleaning up the transliteration did take time. 
Considering the amount of Z-80 software that was quickly (and crudely) transliterated to the 68000, asking Digital Research to develop a version for the 68000 would not have been unreasonable. Likewise, they might have reached out to Technical Systems Consultants or Microware for a 68000 version of their OS products for the 6800 and 6809.
At this point, you should be able to see that all of the usually mentioned strikes against the 68000 were not strikes at all. Balls. The 68000 really should have walked the bases, so to speak.
Setting the 68000 aside for a moment, I've mentioned the 6809 a bit above. You may be aware that Apple considered the 6809 for the Macintosh, and even wired up a prototype before deciding that the Macintosh really needed more address space. 
I've heard, but have not corroborated, that the 6809 was also considered by IBM for their PC somewhere along the line. Would it have been a bad choice?
On the plus side:
(1) The 6809 was designed from the beginning to handle high-level languages, multi-tasking and such. I mentioned Microware's OS-9/6809 above. Uniflex from Technical Systems Consultants was also available in 1979. (TSC's Flex for 6809 was available almost as soon as the CPU was.) IBM would have had two relatively mature quality OSes and a good developers' environment and community from the outset, and much less aggressive partners to work with.
(2) Motorola did have a page-mode MMU part for the 6809, the 6829. External MMUs do tend to slow a processor down a little, but, with the 6829, the 6809 was able to compete with minicomputers -- minicomputers from the mid-1970s, but that's not bad considering that those mid-1970s minicomputers were still very actively used into the mid-1980s.
(3) Motorola had a floating-point ROM for the 6809 as well, the 6839. It was not as fast as floating point in hardware, but it was ready, and cheaper. 
(4) The 6809 was fairly well-known at the time for its ability to handle graphics. 
On the minus side:
(1) The 6809 did not have segment registers. Breaking the 64K barrier would have required bank switching or full paging, and doing large arrays with bank switching or full paging requires a bit more code than even with the 8086's sloppy segments.
(2) The 6809 did not have hardware divide. And hardware multiply was only eight bits wide, so you had to put four of those together to get a 16-bit multiply. That also slowed down the floating-point operations.
(3) Motorola was not talking about extending the design. They were focused on their profitable 6805 and 6801 CPUs.
That last was possibly the killer for the 6809.
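(Backing up to the multiply for a moment, for anyone wondering what "put four of those together" means: here is a minimal sketch in Python of building a 16-bit by 16-bit unsigned multiply out of four 8-bit by 8-bit multiplies. The function names are mine, and the real 6809 code would be a short assembler routine around the MUL instruction, not Python.)

```python
def mul8(a: int, b: int) -> int:
    """Stand-in for an 8-bit by 8-bit unsigned multiply with a 16-bit
    result, which is all the hardware multiply gives you."""
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("operands must fit in 8 bits")
    return a * b


def mul16(x: int, y: int) -> int:
    """16-bit by 16-bit unsigned multiply with a 32-bit result, built from
    four 8x8 partial products that are shifted into place and summed."""
    if not (0 <= x <= 0xFFFF and 0 <= y <= 0xFFFF):
        raise ValueError("operands must fit in 16 bits")
    xh, xl = x >> 8, x & 0xFF
    yh, yl = y >> 8, y & 0xFF
    high = mul8(xh, yh) << 16
    middle = (mul8(xh, yl) + mul8(xl, yh)) << 8
    low = mul8(xl, yl)
    return high + middle + low


if __name__ == "__main__":
    print(mul16(1234, 5678) == 1234 * 5678)
```

Four multiplies plus the shifting and adding with carries is where the time goes, and that cost gets paid again inside every floating-point multiply.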
If Motorola had been showing signs of actually incorporating the 6809 as a core in microcontrollers similar to the 6801 and 6805, IBM could have been confident in being able to get them to build a 6809 with something like the 6829 memory management unit built-in, and integrated MMUs tend not to slow CPUs down nearly as much as external MMUs.
And they could have had reason to expect Motorola to extend the 6809 architecture. Simply adding linear segment registers and full 16-bit hardware multiply and divide to the 6809 would have made it head-to-head competitive with the 8086, in spite of the 6809's 8-bit architecture. Extending the architecture in something like the way the 8086 was extended to the 80386 would have been no particular problem, either.
But such a 6809 would also, as the theory goes, have eaten into the 68000's market. 
(It is true that adding segment registers would have conflicted with the page-mode MMU operation that had become sort-of expected in 6809 software, requiring a small bit of work to bring, for example, OS-9/6809 up in a segmented model instead of a paged model.)
I personally think Motorola should have given the 6809 more attention, as I have mentioned elsewhere in this blog.
Anyway.
No, not anyway.
The Z8000 was not ready, and was not mature. 
The 6809 was ready, and was mature. It would have made a very good base for a PC, if Motorola had had plans to upgrade the CPU architecture in future offerings. Such plans were beginning to be obviously not in evidence.
The 68000 was ready, and Motorola was clearly committing to supporting and upgrading the architecture. Anyone who tells you otherwise either does not know or is ignoring the overall situation in the marketplace.
The 68000 was also 8-bit capable, contrary to what some have said.
Even if it would have taken an additional six months to a year to qualify, IBM never did a proper qualification of the 8088 PC design either, and there was no formal marketing study that defined a market window to be met or anything like that. And there really was no reason to expect it to take any longer.
It was not theoretical delays, not deficiencies in the 68000, per se, not anything like that.
So, drum roll:
Here is the real reason, from what I heard at the time, and I still believe it:
The 68000 was powerful enough to allow building a microcomputer that would have competed with IBM's System/32, /34, /36, and /38 series of minicomputers.
(Yeah. I saw that, when I was doing the internship with IBM. The System/3 CPU even looked a little like the 68000, superficially, inside. We could guess they'd have done better by moving the System/3 series to their own custom version of 68K, but thinking about that requires enough hypothesis contrary to fact to push us into the realm of writing alternative reality SF.
And we can give a nod to IBM's first exercise in putting the System/360 on a microprocessor, while we're at it. Look up the IBM Personal Computer XT/370. That was real history.)
The x86 was not powerful enough for that. That left IBM's marketing team able to imagine they had breathing room.
The real reason was the same reason Motorola didn't support the 6809 the way they should have:
There was this meme that seemed to go around every marketing department back then --
"MUST. NOT. COMPETE. WITH. OURSELVES!!!!!!!!!"
(I think we all now know that, when you're careful to avoid going to all-out war with yourself, competing with yourself keeps you from getting complacent.)
So the IBM PC project had to be kept hidden. And that is why it resulted in a non-optimal product. And the casual engineering of the x86 PC itself ultimately almost took IBM down with it, anyway. 
Microsoft and Intel would avoid the back-swill of their sins with a lot of planned smoke-and-mirrors, always (barely) able to make it look like they were leading the way out of the mess they themselves were creating. 
And that is probably as close as you can get to the real story of why bad engineering prevailed, but was only truly successful for Microsoft and Intel, both of whom are now eating and drinking the pollution they made -- along with the rest of us.
The real question is
Why did such a non-optimal design succeed? 
Full answer takes us deep into religion. I'm not going to ask you to go there with me today. Maybe some other day. 
Quicker answer, if only partial: 
The (business) world wanted spreadsheets along with the word processors. We didn't all know exactly what they were, but we wanted a bigger and better calculator that would allow us to do with our accounting books what word processors allowed with prose records. And we wanted it cheap, and we wanted it big. And we wanted it on our desks.
The Apple II got us close to giving us that, but spreadsheets on the Apple were not as intuitive as word processors were becoming, and were somewhat limited in size.
Both the x86 and the 68K supported more intuitive spreadsheet apps capable of handling larger spreadsheets, but the 68K was a threat to marketing departments.
That was why the IBM PC snowballed. Small, weak things are sometimes bigger and badder than big strong things. (That's the short version of the religious discussion, too, by the way.)
 
There really was only one way to have avoided the mess, and that was for IBM to have resisted the temptation to try to jump into the front of the race. (And I'm busy writing a not-very-good novel about that, when I'm not too tired after a day of delivering mail to keep food on the table, so I'll forgo talking about that here.)