Vowels are still phonemic in Japanese. Syllables in the language are (ignoring length and [N]) either V or CV. Ume ‘plum’ has two morae, and ii ‘good’ also has two, with no consonants at all. The fact that every syllable is of the form (C)V does not mean that the phonemes are of that form, just that the possibilities for combining phonemes into syllables are more heavily restricted than in English. Phonemes are atomic; if a unit can be split into smaller sounds that are themselves distinct, recognizable, and contrastive, then it’s not a single phoneme. English [ng] is always preceded by a vowel and must occur in the coda of a syllable, but the phoneme is [ng] itself, not [ang], [ing], etc.
Japanese speakers do consider ‘kono’ and ‘sono’ to have four separate sounds each. It’s just that Japanese has only about 15 consonants and 5 vowels and requires every syllable to be of the form V or CV, so there are only 90 or so possible syllables. It’s thus easier to make a writing system with those 90 syllables than to make a distinct symbol for each separate consonant and vowel, where you’d have only 20 or so symbols but words would be twice as long to write. (In practice, it’s more complicated than that. Consonants and vowels can also be doubled: for vowels this is indicated by writing the pure vowel after the syllable, and for consonants by a special symbol preceding the syllable. There’s also a moraic nasal [N] that counts as a unit of its own and can’t be geminated. On the other hand, a few CV syllables are forbidden (or, really, altered when they occur), so that shrinks the syllabary a bit. Also, the writing system tends to use Chinese characters for roots and hiragana for inflections.)
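To make the trade-off concrete, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and the inputs are just the round figures from above (15 consonants, 5 vowels), not a real inventory. It ignores [N], the gemination mark, and the forbidden combinations, so it only gives a ballpark figure.

```python
def syllabary_size(n_consonants: int, n_vowels: int) -> int:
    """Count distinct (C)V syllables: bare V plus every C+V pairing."""
    return n_vowels + n_consonants * n_vowels


def alphabet_size(n_consonants: int, n_vowels: int) -> int:
    """Count symbols needed if each consonant and vowel gets its own."""
    return n_consonants + n_vowels


def symbols_needed(word: str, vowels: str = "aiueo") -> tuple[int, int]:
    """Return (alphabetic, syllabic) symbol counts for a romanized (C)V word.

    Assumes every syllable is V or CV with a single-letter consonant,
    so there is exactly one syllabic symbol per vowel letter.
    """
    alphabetic = len(word)
    syllabic = sum(1 for ch in word if ch in vowels)
    return alphabetic, syllabic


print(syllabary_size(15, 5))   # 80
print(alphabet_size(15, 5))    # 20
print(symbols_needed("kono"))  # (4, 2)
```

With these round numbers the syllabary comes out at 80 symbols against 20 for an alphabet, and ‘kono’ takes two symbols instead of four, which is the "fewer symbols but twice as long" trade-off described above.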
I suppose you could try to dismiss vowels and consonants altogether and claim that the language has just 90 or so phonemes like [a], [ka], [sa], etc., but you’d run into problems quickly. Every single phoneme in this language would be syllabic, with no consonants at all in the language. It’s clear that [ka], [ki], [ku], etc. are very similar, but that analysis offers no explanation for why the language has so many phonemes of that form. Phonetically, the features of those sounds would be bizarre. You might be able to categorize [na], [ni], etc. as prenasalized vowels, but [chi], [su], [ra], etc. would be far more difficult to explain, and there wouldn’t be any similar phonemes in any other language. Japanese morphology would be far harder to explain, too. Many simple verb roots are of the form CVC (e.g. kak- ‘write’), with inflections of the form -V… added, making them legal words. Using your method, we’d have to reanalyse those changes as dropping the final phoneme, then adding another that varies a lot but remains suspiciously similar to the one that was dropped.
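The morphology point can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: romanized input, single-letter consonants, no [N] or geminates, and a hypothetical helper name (split_units). It takes the consonant-final root kak- ‘write’, attaches a few vowel-initial endings, and then cuts the results into (C)V units the way the syllable-as-phoneme analysis would.

```python
VOWELS = set("aiueo")


def split_units(word: str) -> list[str]:
    """Split a romanized word into (C)V units.

    Assumes single-letter consonants and no moraic nasal or geminates;
    raises ValueError if the word ends in a consonant.
    """
    units = []
    onset = ""
    for ch in word:
        if ch in VOWELS:
            units.append(onset + ch)  # close the unit at each vowel
            onset = ""
        else:
            onset += ch  # hold the consonant until its vowel arrives
    if onset:
        raise ValueError(f"trailing consonant {onset!r} in {word!r}")
    return units


ROOT = "kak"  # 'write'
for ending in ["u", "anai", "e", "imasu"]:
    form = ROOT + ending
    print(form, split_units(form))
# kaku ['ka', 'ku']
# kakanai ['ka', 'ka', 'na', 'i']
# kake ['ka', 'ke']
# kakimasu ['ka', 'ki', 'ma', 'su']
```

With consonants and vowels, the root is a constant kak- and the endings are -u, -anai, -e, -imasu. With (C)V units as atoms, the only constant piece is ‘ka’, and the second unit flips between ‘ku’, ‘ka’, ‘ke’ and ‘ki’ with nothing in the analysis to say why they all happen to start the same way.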
In short, it would be impossible to do linguistics with the language, and that form of Japanese would look like no other language on earth. That isn’t to say that phoneme inventories are necessarily set in stone, though. Some treatments of Japanese analyse geminate consonants like ‘pp’ as combinations ‘Qp’, where Q is some sort of doubling phoneme. (I’m not sure whether the analysis is that it assimilates to the following consonant, triggers doubling and then gets deleted itself, or what.)