If it seems like the last few of these "Kirt's Cogitation" installments have been
on the subject of AI (Artificial Intelligence), there is a good reason: the last
three have been. My motivation stems from having enlisted the services of many different
AI engines over the last year or so, primarily for obtaining historical
data on notable persons and events, receiving explanations of science and engineering
principles - including equations - and soliciting software code snippets and/or
having code snippets analyzed for efficiency. Results vary from spot-on excellence
to devastatingly wrong, and that goes for all realms of inquiry.
Others of my writings have lamented and criticized the number of online calculators
that produce glaringly incorrect results, or that are off by one or more orders of magnitude
(usually due to a unit multiplier error).
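That unit-multiplier class of error is easy to reproduce. The short Python sketch below is my own illustration (the helper name and the FM-broadcast example are mine, not taken from any particular calculator): dropping a single SI prefix throws a wavelength calculation off by exactly six orders of magnitude.

```python
# Demonstrating the unit-multiplier error described above: feeding a
# frequency in MHz to a function that expects Hz yields a result that
# is wrong by exactly 1e6.
C = 299_792_458.0  # speed of light in a vacuum, m/s

def wavelength_m(freq_hz):
    """Free-space wavelength in meters for a frequency given in hertz."""
    return C / freq_hz

f_mhz = 100.0                        # a 100 MHz FM broadcast signal
correct = wavelength_m(f_mhz * 1e6)  # prefix applied: ~3 m
wrong = wavelength_m(f_mhz)          # prefix dropped: ~3,000 km
print(f"correct: {correct:.3f} m, wrong: {wrong:.3e} m, ratio: {wrong / correct:.0e}")
```

Carrying the unit conversion explicitly (or using a units-aware library) is the simplest defense against this whole class of mistake.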
As the purveyor of online calculators, spreadsheets, and software that perform
the aforementioned types of calculations, and knowing the increasing tendency people
have to use AI for getting answers to all sorts of questions, I presented
the following to ChatGPT5 today (September 26, 2025):
Me:
"AI engines have been quite helpful in my engineering work. However, ChatGPT
and other AIs very often present invalid responses to objective queries (equations,
pertinent dates, software code, etc.). There are also many online calculators that
produce erroneous results for science and engineering queries. Given that reality,
what is the general sense among science and engineering types regarding which is
more useful / reliable: AI, online calculators, Windows software programs (not necessarily
by Microsoft)?"
ChatGPT5: (exact response)
"Short answer: For work where numerical correctness and traceability matter,
engineers generally trust validated, domain‑specific software and well‑tested code
far more than general‑purpose AI or ad‑hoc web calculators. AI is useful as a copilot,
not a source of ground truth. Web calculators are convenient but uneven - use only
those that cite equations and standards and still verify.
1. How practitioners tend to rank tools by reliability (for objective results)
- Validated engineering software and/or your own scripts using reputable libraries
- Examples: FEM/CFD/CAD/EDA packages; MATLAB/Octave/Python/Julia with NumPy/SciPy/etc.
- Why: documented verification/validation (V&V), regression tests, published
benchmarks, versioning, and auditability of assumptions, units, and inputs.
- Caveat: You can still be wrong via bad models/boundary conditions; use built‑in
error estimates and cross‑checks.
1. Reputable online calculators that show their work
- Useful for standard formulas (beam deflection, pipe flow, psychrometrics, basic
statistics).
- Look for: displayed equations, unit control, references to standards/texts,
version/date, and disclosed assumptions/ranges.
- Avoid: calculators that hide formulas, don't handle units, or don't cite sources.
Always sanity‑check against a known case.
1. General‑purpose AI chatbots (including code assistants) for objective facts/numbers
- Strengths: speed for brainstorming, locating references, summarizing standards,
writing starter code/tests, unit conversions (when explicitly specified), deriving
well‑known formulas step‑by‑step.
- Weaknesses: hallucinations, silent assumption changes, sign/unit mistakes, outdated
data, inconsistent numerics. Not acceptable as sole authority in regulated or safety‑critical
work.
- Safe pattern: use AI to draft; you validate with independent methods/tools.
Why this hierarchy persists
- Traceability and V&V: Engineering decisions require a defensible chain from
inputs to outputs. Desktop/domain software and reproducible scripts provide that;
most AI/chat UI does not.
- Known error modes: With calculators/AI you often can't see the implementation
or test suite. Mature software and libraries usually publish test coverage and benchmarks.
- Organizational policy: Many engineering orgs permit AI for assistance but require
human review and independent verification before use in design/analysis.
Practical guidance you can apply
- Match tool to risk:
- High consequence or regulated: validated domain software or well‑tested code,
plus an independent check (analytical limit, alternate solver, or vendor hand calc).
- Medium: scripts/notebooks with reputable libraries; unit tests against textbook/standard
examples.
- Low/early scoping: calculators or AI for speed, then spot‑check.
- Triangulate: use two different methods/tools; expect agreement within a pre‑set
tolerance.
- Be explicit with units/standards: specify unit systems, constants, and the standard/revision
(e.g., ASME, AISC, ISO, IEC) used by the tool.
- Prefer tools that:
- Show equations and assumptions
- Cite references and versions
- Handle units rigorously
- Offer uncertainty/error estimates and mesh/convergence controls (for numerical
solvers)
- Have a visible validation suite or user community benchmarks
Build your own "golden cases": a small set of problems with known answers to quickly
vet any calculator/AI output or new software version.
Bottom line
- For reliability: validated domain software and reproducible code > reputable
calculators that show equations and sources > general‑purpose AI for numbers/facts.
- For productivity: AI is valuable as a drafting/exploration/copilot tool, but
its outputs should be treated as hypotheses to verify, not authoritative results."
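As an aside, the "golden cases" suggestion in that response is worth taking literally. The sketch below is a minimal Python harness of my own devising, using the standard textbook formula for the center deflection of a simply supported beam with a center point load (delta = P·L³/48EI) as the sample golden case; the specific load, span, and tolerance values are illustrative, not an official test suite.

```python
# A minimal "golden cases" harness: a handful of problems with known
# answers, used to vet any calculator, script, or AI-supplied formula
# before trusting it.
import math

def beam_deflection_center(P, L, E, I):
    """Center deflection of a simply supported beam with a center point
    load: delta = P * L^3 / (48 * E * I). SI units throughout."""
    return P * L**3 / (48 * E * I)

# Golden cases: (inputs, expected answer, relative tolerance).
# 1 kN load, 2 m span, steel (E = 200 GPa), I = 8e-6 m^4 -> ~0.104 mm.
GOLDEN_CASES = [
    ({"P": 1000.0, "L": 2.0, "E": 200e9, "I": 8e-6}, 1.0417e-4, 1e-3),
]

def vet(func, cases):
    """Return True only if every golden case agrees within its tolerance."""
    for inputs, expected, rel_tol in cases:
        got = func(**inputs)
        if not math.isclose(got, expected, rel_tol=rel_tol):
            print(f"FAIL: {inputs} -> {got:.6g}, expected {expected:.6g}")
            return False
    return True

print("golden cases pass" if vet(beam_deflection_center, GOLDEN_CASES)
      else "vet failed")
```

The same `vet` check can be pointed at a new software version, a web calculator's output transcribed by hand, or a formula an AI engine supplies, before any of them is trusted with real work.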
Here, I ask you: did you notice what I did regarding ChatGPT5's response?
Me:
"In your ranking of the three options, you listed them all as "1."
See what I mean?"
ChatGPT5:
"You're absolutely right - my formatting was sloppy, and that undercut the clarity
of what I was trying to convey. Thank you for calling it out. Let me reorganize
this clearly:
Ranked by Reliability for Objective Engineering/Scientific Work:
1. Validated Engineering Software & Reputable Scripts/Libraries
Examples: ANSYS, SolidWorks, MATLAB, Python with NumPy/SciPy, well-tested proprietary
tools. Why most reliable: These are built specifically for technical use, with
rigorous validation, error handling, version control, and traceability. They are
designed to produce repeatable, accurate results under documented constraints.
2. High-Quality Online Calculators (from trusted sources)
Examples: NIST databases, engineering toolboxes from professional societies, university-hosted
calculators with visible equations. Why middling reliability: Good ones can
be excellent for standard calculations but vary widely in quality. Always verify
against a known result or textbook example. They lack the full rigor of dedicated
engineering software.
3. General-Purpose AI Chatbots (like ChatGPT, etc.)
Examples: ChatGPT, Claude, Copilot. Why least reliable for raw objectivity: They are designed
for language generation, not mathematical or scientific precision. They frequently
hallucinate numbers, constants, and equations. They should only be used as brainstorming
aids or for structuring code - never as a source of ground-truth calculation.
So, to restate clearly:
Most reliable → 1 (Specialized software) → 2 (Reputable calculators)
→ 3 (General AI) → Least reliable
Thanks again for pointing that out - clarity is everything in technical fields.
Does that line up better with what you've observed in practice?"
QED - Kirt Blattenberger